Abstract

We consider a Bayesian game of pure informational externalities, in which a group of agents 
learn a binary state of the world from conditionally independent private signals, by repeatedly 
observing the actions of their neighbors in a social network. 

We show that the question of whether or not the agents learn the state of the world depends 
on the topology of the social network. In particular, we identify a geometric "egalitarianism" 
condition on the social network graph that guarantees learning in infinite networks, or learning 
with high probability in large finite networks, in any equilibrium of the game. We give examples 
of non-egalitarian networks with equilibria in which learning fails. 

Keywords: Bayesian learning, rational expectations, informational externalities, social networks,
aggregation of information.

1 Introduction 

We consider a Bayesian game played on a social network. There is an unknown state of the world 
S ∈ {0, 1}, and each of a finite or countably infinite group of agents receives a conditionally i.i.d.
signal that carries information on S. In each discrete time period, each agent chooses an action 
in {0, 1}, where the utility is one if the action equals S and zero otherwise, and is discounted 
exponentially, by a common rate. The agents gain no further direct indication of the merit of their 
actions, but may learn by observing the actions of their neighbors in a social network. Externalities 
in this game are therefore purely informational; an agent's utility does not depend directly on the 
actions of others, and so agents play sub-optimally only in hope of extracting additional information 
from their neighbors' future actions. 

This game can be thought of as modeling repeated lifestyle choices, where much can be learned
from others, and utilities (e.g., longevity) are only revealed after a long time. Alternatively,
consider the choice of religion: this is evidently a repeated choice which is susceptible to
social influence, and, arguably, the payoff is only delivered in the afterlife.^1

We consider the question of learning. When the number of agents is infinite, when is it the case 
that the agents learn S and all converge to the correct action? In finite networks, when does learning 
happen with high probability? We show that learning may or may not occur, depending on the 
geometry of the social network. In particular, we show that in infinite networks that are egalitarian 
- in a sense we define below - learning occurs with probability one in any equilibrium. We require the
distributions of private signals to be non-atomic, but do not impose unbounded likelihood ratios:
learning occurs in egalitarian networks even for (informative) signals with bounded likelihood ratios.
We likewise show that, for large, egalitarian, finite networks, learning occurs with high probability
in any equilibrium; under a uniform egalitarianism constraint, as the size of the network tends
to infinity, the probability of learning approaches one. We provide examples of non-learning in
non-egalitarian networks.

*U.C. Berkeley and Weizmann Institute. E-mail: mossel@stat.berkeley.edu. Supported by NSF award DMS
1106999, by ONR award N000141 110140 and by ISF grant 1300/08.

^U.C. Berkeley. Supported by a Sloan Research Fellowship in mathematics and by NSF award DMS 1208339.

* Weizmann Institute. Supported by ISF grant 1300/08. Omer Tamuz is a recipient of the Google Europe
Fellowship in Social Computing, and this research is supported in part by this Google Fellowship.

1 We would like to thank Prof. Najeeb Ali for pointing out this application.

We represent social networks as directed graphs. The nodes of the graph are the agents, and 
there is an edge between agent i and j if i observes the actions of j. The social networks we consider 
are strongly connected: for each pair of agents i and j there is a directed path from i to j in the 
social network graph. The out-degree of agent i is the number of agents it observes. We assume 
that it is finite, but allow infinite in-degrees, so that some agents may be observed by infinitely 
many others. 

We show that learning occurs in infinite graphs that are egalitarian in the following sense. First, 
we require that out-degrees are bounded: there is some number d such that no agent observes more 
than d others. Second, we require that there exists a number L such that, if there is a path of length 
k from agent i to agent j, then there is a path from j back to i, of length at most L · k. In this
case, we call the graph L-connected. Equivalently, a graph is L-connected if, whenever i observes j,
then there is a path of length at most L from j back to i. In a sense, and similarly to the bounded 
degree requirement, L-connectedness ensures that the "importance" of different agents does not 
vary too much; an agent is never too far removed from those who observe it. We therefore think 
of both of these conditions as describing a network that is, to a certain extent, egalitarian. From 
a mathematical viewpoint, these conditions arise naturally as a compactness condition in a certain 
topology of graphs. 

We provide two examples of non-learning. In the first example the social network graph has
bounded out-degrees, but is not L-connected. We show here that, for a low enough discount factor, 
myopic behavior is an equilibrium that leads to convergence to the wrong action with probability 
that is small, but bounded away from zero. 

In the second example the graph is undirected, so that whenever i observes j then j observes 
i; the graph is therefore 1-connected. However, it does not have bounded degrees, and so is not
"egalitarian". Here we construct a rather involved equilibrium, with non-myopic behavior, which 
again leads to non-learning with probability bounded away from zero. 

Learning on social networks is a widely studied field; a complete overview is beyond the scope 
of this paper, and so we shall note only a few related studies. Bala and Goyal [4] study a similar
model, and show results of learning or non-learning in different cases. Their model is crucially 
different from ours, making the results incomparable: it is boundedly-rational, and so agents do 
not take into account the choices of their neighbors when forming their beliefs. In their model, 
agents gain additional information directly by taking certain actions, and so their strategic behavior 
is a product of an individual explore-exploit dilemma, rather than social manipulation. 

Other notable models of learning through repeated social interaction are those of DeGroot [9],
Ellison and Fudenberg [11], DeMarzo, Vayanos and Zwiebel [10] and Golub and Jackson [14]. All
of these papers describe models of agents that are not rational but rather use some rule of thumb.
In fact, to the best of our knowledge, no previous work considers learning, in repeated interaction,
on social networks, in a fully rational, strategic setting. In a previous paper [19], we consider the
same question, but for myopic agents, who, again, display no strategic behavior. 

The study of agreement (rather than learning) on social networks is also related to our work,
and in fact we make crucial use of the work of Rosenberg, Solan and Vieille [24], who prove an
agreement result for a large class of games with informational externalities played on social networks.
This is a field of study founded by Aumann's "Agreeing to disagree" paper [3] and elaborated on
by Sebenius and Geanakoplos [25], McKelvey and Page [16], Parikh and Krasucki [23], Gale and
Kariv [12], Ménager [17] and Mueller-Frank [22], to name a few. The moral of this research is that,
by-and-large, rational agents eventually reach consensus, even in strategic settings. Indeed, the
fact that agents eventually agree is a condition for learning, since otherwise it is impossible that all
converge to the same action. We elaborate on the work of [24] and show that when private signals
are non-atomic then, asymptotically, agents agree on best responses.

Another strain of related literature is that of herd behavior, started by Banerjee [5] and
Bikhchandani, Hirshleifer and Welch [7], with significant generalizations and further analysis by Smith and
Sørensen [26], Acemoglu, Dahleh, Lobel and Ozdaglar [1] and recently Lobel and Sadler [15]. Here,
the state of the world and private signals are as in our model, and agents are rational. However, in 
these models agents act sequentially rather than repeatedly, and so no non-trivial strategic behavior 
arises. 

In Section 2 we define our model formally and state our results. In Section 3 we prove our
results, and in Section 4 we provide examples of non-learning.



2 Model and results 
2.1 Definitions 

The following definition is adapted from [21].

Definition 2.1. Let $\Omega$ be a measurable space of private signals, and let $\mu_0$ and $\mu_1$ be mutually
absolutely continuous measures on $\Omega$. Let $W \sim \frac{1}{2}\mu_0 + \frac{1}{2}\mu_1$, and let $Z = \frac{d\mu_1}{d\mu_0}(W)$. We assume that
$\mu_0$, $\mu_1$ are such that the distribution of $Z$ is non-atomic.

Here $Z$ can be interpreted as the likelihood ratio of the events that $W$ came from either $\mu_0$ or
$\mu_1$. Notice that in particular our assumption implies that $\mu_0 \neq \mu_1$, since otherwise $Z$ is equal to
one with probability one.

Definition 2.2. Let $V$ be a countable set of agents, which we take to equal $\{1, 2, \ldots, n\}$ in the
finite case and $\mathbb{N} = \{1, 2, \ldots\}$ in the infinite case.

Let $\{0, 1\}$ be the set of possible values of the state of the world $S$, and let $\delta_0$ and $\delta_1$ be the
distributions on $\{0, 1\}$ such that $\delta_0(0) = \delta_1(1) = 1$.

Let $W_i \in \Omega$ be agent $i$'s private signal, and denote $W = (W_1, W_2, \ldots)$.

Let

$$(S, W) \sim \mathbb{P},$$

where

$$\mathbb{P} = \frac{1}{2}\,\delta_0 \times \mu_0^V + \frac{1}{2}\,\delta_1 \times \mu_1^V.$$

Equivalently, $\mathbb{P}[S = 1] = \mathbb{P}[S = 0] = 1/2$, and, conditioned on $S$, the $W_i$ are i.i.d. $\mu_S$. We shall
henceforth use $\mathbb{P}[\cdot]$ and $\mathbb{E}[\cdot]$ to denote probabilities and expectations in the distribution described
above. We will implicitly extend this probability space when we allow mixed strategies.

Definition 2.3. Agent $i$'s private belief $I_i$ is defined by

$$I_i = \mathbb{P}[S = 1 \mid W_i].$$

Note that $I_i$ is well defined since $\mu_0$ and $\mu_1$ are mutually absolutely continuous, and that by
the condition in Definition 2.1 it has a non-atomic distribution.
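
As an illustration of the private belief, note that with the uniform prior on $S$, Bayes' rule gives $I_i = Z_i / (1 + Z_i)$, where $Z_i$ is the likelihood ratio of $\mu_1$ to $\mu_0$ at $W_i$. The sketch below assumes, purely for illustration, Gaussian signals $\mu_0 = N(-1, 1)$ and $\mu_1 = N(+1, 1)$; the paper does not fix any signal family, and the function names are hypothetical.

```python
import math


def likelihood_ratio(w: float) -> float:
    """Likelihood ratio of mu_1 to mu_0 at w.

    Assumption (not from the paper): mu_0 = N(-1, 1) and mu_1 = N(+1, 1).
    The ratio of the two densities is exp(-(w-1)^2/2) / exp(-(w+1)^2/2) = exp(2w).
    """
    return math.exp(2.0 * w)


def private_belief(w: float) -> float:
    """I = P[S = 1 | W = w] under the uniform prior P[S = 1] = 1/2."""
    z = likelihood_ratio(w)
    return z / (1.0 + z)
```

Under this assumed signal family the belief is a strictly increasing, continuous function of the signal, so its distribution is non-atomic, as Definition 2.1 requires.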
Definition 2.4. A social network $G = (V,E)$ is a directed graph. We assume throughout that it
is simple and strongly connected. Let the set of neighbors of $i \in V$ be $\partial i = \{j : (i,j) \in E\} \cup \{i\}$
(i.e., $\partial i$ includes $i$). The out-degree of $i$ is equal to $|\partial i|$, and is assumed to always be finite.

Finite out-degrees mean that an agent observes the actions of a finite number of other agents. 
We do allow infinite in-degrees; this corresponds to agents whose actions are observed by infinitely 
many other agents. 

We consider the discrete time periods $t = 0, 1, 2, \ldots$, where in each period each agent $i \in V$ has
to choose one of the actions in $\{0, 1\}$. This action is a function of agent $i$'s private belief, as well as
the actions of its neighbors in previous time periods, and so can be thought of as a function from
$[0,1] \times \{0,1\}^{|\partial i| \cdot t}$ to $\{0,1\}$.

Definition 2.5. A pure strategy at time $t$ of an agent $i \in V$ is a Borel-measurable function
$q^i_t : [0,1] \times \{0,1\}^{|\partial i| \cdot t} \to \{0,1\}$. A pure strategy of an agent $i$ is the sequence of functions
$q^i = (q^i_0, q^i_1, \ldots)$, where $q^i_t$ is $i$'s pure strategy at time $t$. A mixed strategy $Q^i$ of agent $i$ is
a pure-strategy-valued random variable. We shall henceforth refer to mixed strategies simply as
strategies.

A strategy profile is the set of (mixed) strategies $Q = \{Q^i : i \in V\}$.

Note that strategy profiles are in general mixed, and so $\mathbb{P}$ will denote averaging over this
additional randomness. Each random variable $Q^i$ will be taken to be independent of all other $Q^j$,
as well as all the other previously defined random variables.

Definition 2.6. Fix a social network $G = (V,E)$ and a strategy profile $Q$. The action of agent $i$
at time $t$ is denoted by $A^i_t \in \{0, 1\}$. Denote the history of actions of the neighbors of $i$ before time
$t$ by $A^{\partial i}_{[0,t)} = \{A^j_{t'} : t' < t,\ j \in \partial i\}$. The actions are recursively defined as follows:

$$A^i_t = A^i_t(G,Q) = Q^i_t\big(I_i, A^{\partial i}_{[0,t)}\big).$$

Note that we limit the action to be a function of the private belief $I_i$, as opposed to the
private signal $W_i$. However, as will become apparent when we next define the agents' utilities, a
strategy that discriminates between private signals that induce identical private beliefs can always
be replaced by an equivalent strategy that does not.
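
The recursion that defines the actions can be sketched in code for a finite network and pure strategies. This is a minimal sketch under stated assumptions: the names `simulate_actions` and `follow_majority` are hypothetical, each strategy is passed as a plain function of the period, the private belief and the observed history, and `follow_majority` is a rule of thumb used only to exercise the simulator. It is not an equilibrium strategy of the game.

```python
from typing import Callable, Dict, List, Tuple

# A history maps each observed agent j (including the agent itself) to its past actions.
History = Dict[int, Tuple[int, ...]]
# A pure strategy: (period t, private belief, observed history) -> action in {0, 1}.
Strategy = Callable[[int, float, History], int]


def simulate_actions(neighbors: Dict[int, List[int]],
                     beliefs: Dict[int, float],
                     strategies: Dict[int, Strategy],
                     horizon: int) -> Dict[int, List[int]]:
    """Return the actions of every agent i in periods 0, ..., horizon - 1."""
    actions: Dict[int, List[int]] = {i: [] for i in neighbors}
    for t in range(horizon):
        # Agents act simultaneously, so freeze all histories before appending.
        histories = {
            i: {j: tuple(actions[j]) for j in set(neighbors[i]) | {i}}
            for i in neighbors
        }
        for i in neighbors:
            actions[i].append(strategies[i](t, beliefs[i], histories[i]))
    return actions


def follow_majority(t: int, belief: float, history: History) -> int:
    """Hypothetical rule of thumb, NOT an equilibrium strategy."""
    if t == 0:
        return 1 if belief > 0.5 else 0
    last = [acts[-1] for acts in history.values()]
    return 1 if 2 * sum(last) > len(last) else 0
```

For example, on the directed 3-cycle `{1: [2], 2: [3], 3: [1]}` each agent's history contains its own past actions and those of the single agent it observes, matching the convention that the neighbor set of $i$ includes $i$.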

Definition 2.7. Let $0 < \lambda < 1$ denote the agents' common discount factor. Given a social
network $G$ and strategy profile $Q$, agent $i$'s utility at time $t$, $u_{i,t}$, is given by

$$u_{i,t} = u_{i,t}(G,Q) = \mathbb{P}\big[A^i_t(G,Q) = S\big].$$

Agent $i$'s utility $u_i$ is given by

$$u_i = u_i(G,\lambda,Q) = (1-\lambda)\sum_{t=0}^{\infty} \lambda^t u_{i,t}(G,Q).$$

Note that $u_i \in [0, 1]$, due to the normalization factor $(1 - \lambda)$.

Definition 2.8. A game $\mathcal{G}$ is a 4-tuple $(\mu_0, \mu_1, \lambda, G)$ consisting of two measures, a discount factor
and a social network graph, satisfying the conditions of the definitions above. The agents' strategies
and utilities are those of Definitions 2.5 and 2.7.

Definition 2.9. Given a game $\mathcal{G} = (\mu_0, \mu_1, \lambda, G)$, strategy profile $Q$ is an equilibrium if, for
every agent $i \in V$ it holds that

$$u_i(Q) \ge u_i(R),$$

for any $R$ such that $R^j = Q^j$ for all $j \neq i$ in $V$.

Definition 2.10. Fix a game $\mathcal{G} = (\mu_0, \mu_1, \lambda, G)$ and a strategy profile $Q$.

Let $C_i$ be the set of actions (i.e., a subset of $\{0, 1\}$) that $i$ takes infinitely often:

$$C_i = C_i(G, Q) = \{s \in \{0, 1\} : A^i_t(G, Q) = s \text{ for infinitely many values of } t\}.$$

We call $C_i$ the infinite action set of $i$. When $C_i = C_j$ for all $i, j$ then we denote $C = C_i = C_j$.

We show below in Theorem 3 that, in equilibrium, there always exists a random variable $C = C(G,Q)$ such
that almost surely $C_i = C$, for all agents $i$.

Definition 2.11. Let $G = (V,E)$ be a directed graph. A (directed) path of length $k$ from $i \in V$ to
$j \in V$ in $G$ is a sequence of $k + 1$ nodes $i_1, \ldots, i_{k+1}$ such that $(i_n, i_{n+1}) \in E$ for $n = 1, \ldots, k$, and
where $i_1 = i$ and $i_{k+1} = j$.

Definition 2.12. A directed graph $G$ is $L$-connected if, for each $(i,j) \in E$, there exists a path of
length at most $L$ in $G$ from $j$ to $i$.

Equivalently, $G$ is $L$-connected if whenever there exists a path of length $k$ from $i$ to $j$, there
exists a path of length at most $L \cdot k$ from $j$ back to $i$. Note that 1-connected graphs are commonly
known as undirected graphs.
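
For a finite graph, the edge-based form of the definition can be checked directly by breadth-first search. The following is a minimal sketch, assuming the graph is given as adjacency lists mapping each agent to the agents it observes; the function names are hypothetical and the check does not apply to infinite graphs.

```python
from collections import deque
from typing import Dict, Hashable, List, Optional

# Adjacency lists: i -> list of agents that i observes (an edge (i, j) for each j).
Graph = Dict[Hashable, List[Hashable]]


def shortest_path_length(graph: Graph, source, target) -> Optional[int]:
    """Length of the shortest directed path from source to target, or None."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v == target:
            return dist[v]
        for w in graph.get(v, []):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None


def is_L_connected(graph: Graph, L: int) -> bool:
    """True if every edge (i, j) has a return path from j to i of length at most L."""
    for i, observed in graph.items():
        for j in observed:
            back = shortest_path_length(graph, j, i)
            if back is None or back > L:
                return False
    return True
```

On a directed cycle with four vertices the return path for each edge has length 3, so the cycle is 3-connected but not 2-connected, while a graph in which every edge is reciprocated is 1-connected.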

2.2 Results 

Our first result is a preliminary lemma that shows that an equilibrium always exists. This result 
is not completely straightforward, especially in the case of infinite graphs. 

Lemma 2.13. Every game Q has an equilibrium. 

The next theorem is our main result. 

Theorem 1 (Learning). Fix $\mu_0$, $\mu_1$, a discount factor $\lambda \in (0,1)$, and positive integers $L$ and $d$.
Let $G$ be an infinite, $L$-connected degree $d$ graph, let $\mathcal{G} = (\mu_0, \mu_1, \lambda, G)$ be a game, and let $Q$ be any
equilibrium strategy profile of $\mathcal{G}$. Then almost surely

$$C_i = \{S\}$$

for all $i \in V$.

That is, all agents learn $S$ and converge to the correct action. We also prove a version of
Theorem 1 for finite graphs.

Theorem 2 (Learning in finite graphs). Fix $\mu_0$, $\mu_1$, a discount factor $\lambda \in (0,1)$, and positive
integers $L$ and $d$. Then there exists a sequence $q(n) = q(\mu_0, \mu_1, L, d, \lambda, n)$ such that $\lim_n q(n) = 1$
and the following holds. Let $G$ be an $L$-connected graph of degree at most $d$ with at least $n$ vertices,
let $\mathcal{G} = (\mu_0, \mu_1, \lambda, G)$ be a game, and let $Q$ be an equilibrium strategy profile of $\mathcal{G}$. Then

$$\mathbb{P}\big[C_i = \{S\} \text{ for all } i \in V\big] \ge q(n).$$

An important component of the learning theorems above is the following agreement theorem,
which stands as an independent result.

Theorem 3 (Agreement). Let $\mathcal{G}$ be a game, and let $Q$ be an equilibrium strategy profile of $\mathcal{G}$. Then
there exists a random variable $C$ such that with probability one $C_i = C$ for all agents $i \in V$.

A result in [20] states that when likelihood ratios are unbounded (i.e., the convex closure of
the support of the private beliefs is $[0,1]$) and agreement is reached (as described in Theorem 3)
then learning occurs in any game with a binary state of the world and conditionally i.i.d. private
signals. Hence any cases of non-learning (necessarily in non-egalitarian networks, by Theorems 1
and 2 above) must feature bounded likelihood ratios.

3 Proofs 

3.1 Rooted graphs 

In the proofs that follow we will make repeated use of the notion of a rooted graph. This section 
starts with some basic definitions and culminates in the definition of a topology on rooted graphs. 
In this we follow our previous work [19] which builds on the work of others such as Benjamini and
Schramm [6] and Aldous and Steele [2]. 

Definition 3.1. A rooted graph is a pair $(G, i)$, where $G = (V, E)$ is a directed graph, and $i \in V$
is a vertex in $G$.

Definition 3.2. Let $G = (V,E)$ and $G' = (V',E')$ be directed graphs, and let $(G,i)$ and $(G',i')$ be
rooted graphs. A rooted graph isomorphism between $(G,i)$ and $(G',i')$ is a bijection $h : V \to V'$
such that

1. $h(i) = i'$.

2. $(j,k) \in E \Leftrightarrow (h(j),h(k)) \in E'$.

If $(G,i)$ and $(G',i')$ are such that there exists a rooted graph isomorphism between them then we
say that they are isomorphic and write $(G,i) \cong (G',i')$. Informally, isomorphic graphs cannot be
told apart when vertex labels are removed; equivalently, one can be turned into the other by an
appropriate renaming of the vertices. The isomorphism class of $(G, i)$ is the set of rooted graphs
that are isomorphic to it, and will be denoted by $[G,i]$.

Definition 3.3. Let $i, j$ be vertices in a directed graph $G$. Denote by $\Delta(i,j)$ the length of the
shortest directed path from $i$ to $j$.

Note that in general $\Delta(i,j) \neq \Delta(j,i)$, since the graph is directed.

Definition 3.4. The (directed) ball $B_r(G,i)$ of radius $r$ of the rooted graph $(G,i)$ is the rooted
graph, with root $i$, induced in $G$ by the set of vertices $\{j \in V : \Delta(i, j) \le r\}$.

Let $[G,i]$ and $[G',i']$ be isomorphism classes of strongly connected rooted graphs. The distance
$D([G,i],[G',i'])$ is defined by

$$D([G,i], [G',i']) = \inf\{2^{-r} : B_r(G,i) \cong B_r(G',i')\}. \tag{1}$$

It is straightforward to show that this is indeed a well defined metric; a diagonalization argument
is needed to show that when $D([G,i], [G',i']) = 0$ then $(G,i) \cong (G',i')$. Note also that while it is
not crucial that the graphs are strongly connected, it is crucial that there is a path from the root
to each vertex.

Intuitively, the larger the radius around i in which the graphs are isomorphic, the closer they 
are. In fact, the quantitative dependence of D(-, •) on r (exponential in our definition) will not be 
of importance here, and we shall only be interested in the topology induced by this metric. We will 
henceforth refer to this topology when discussing rooted graphs. 
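
The ball of radius $r$ around a root can be computed by a breadth-first search that stops expanding at depth $r$. The sketch below is illustrative only: it assumes a graph stored as adjacency lists, takes the ball to consist of the vertices at directed distance at most $r$ from the root (an assumption about the intended inequality), and uses hypothetical function names. Deciding whether two balls are isomorphic as rooted graphs, which the metric requires, is not attempted here.

```python
from collections import deque
from typing import Dict, Hashable, List, Set

# Adjacency lists: i -> list of agents that i observes.
Graph = Dict[Hashable, List[Hashable]]


def ball_vertices(graph: Graph, root, radius: int) -> Set[Hashable]:
    """Vertices whose directed distance from the root is at most radius."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        if dist[v] == radius:
            continue  # do not expand past the boundary of the ball
        for w in graph.get(v, []):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return set(dist)


def ball(graph: Graph, root, radius: int) -> Graph:
    """Subgraph induced on the ball's vertices; the caller keeps track of the root."""
    inside = ball_vertices(graph, root, radius)
    return {v: [w for w in graph.get(v, []) if w in inside] for v in inside}
```

Because out-degrees are finite, every such ball is a finite graph even when the network itself is infinite, which is what makes comparing balls of increasing radius a workable notion of closeness.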

Claim 3.5. Let $\{[G_n,i_n]\}_{n=1}^{\infty}$ be a sequence of rooted graph isomorphism classes such that

$$\lim_{n \to \infty} [G_n,i_n] = [G,i].$$

Then for every $r > 0$ there exists an $N > 0$ such that for all $n > N$ it holds that $B_r(G_n,i_n) \cong
B_r(G,i)$. Furthermore, there exists a subsequence $\{[G_{n_r},i_{n_r}]\}_{r=1}^{\infty}$ such that $B_r(G_{n_r},i_{n_r}) \cong B_r(G,i)$.

Proof. The first part of the claim follows directly from the definition of limits and Eq. (1). The
second part holds for $n_r = \min\{n : B_r(G_n, i_n) \cong B_r(G, i)\}$, which is guaranteed to be finite by the
first part. □

3.2 Locality 

An important observation is that the actions and the utility of an agent, up to time t, depend only on
the strategies of the agents that are within distance t of it. We formalize this notion in this section.

Claim 3.6. Let $\mathcal{G}_1 = (\mu_0, \mu_1, \lambda, G_1)$ and $\mathcal{G}_2 = (\mu_0, \mu_1, \lambda, G_2)$ be games. Let $h$ be a rooted graph
isomorphism between $B_{r+1}(G_1,i_1)$ and $B_{r+1}(G_2,i_2)$ for some $r > 0$, and let $Q^{j_1} = Q^{j_2}$ for all
$j_1 \in B_r(G_1,i_1)$ and $j_2 = h(j_1)$.

Then the games, as probability spaces, can be coupled so that $A^{i_1}_t = A^{i_2}_t$ for all $t < r$.

Some care needs to be taken with a statement such as "agent $j_1$ plays the same strategy as
agent $j_2$"; it can only be meaningful in the context of a bijection that identifies each neighbor of $j_1$
with a neighbor of $j_2$. We here naturally take this bijection to be $h$, and accordingly demand
that it be an isomorphism between balls of radius $r + 1$ (rather than $r$), so that the neighbors of
the agents on the surface of the ball are also mapped.

Proof. Couple the two processes by equating the states of the world and setting $W_{j_1} = W_{h(j_1)}$ for
all $j_1 \in B_r(G_1,i_1)$, and furthermore coupling the choices of pure strategies of $j_1$ and $h(j_1)$.

We shall prove by induction a stronger statement, namely that under the claim hypothesis,
$A^{j_1}_t = A^{j_2}_t$ for any $j_1 \in B_r(G_1,i_1)$, $j_2 = h(j_1)$ and $t < r - \Delta(i_1,j_1)$.

We prove the statement by induction on $t$. For $t = 0$, $A^{j_1}_0$ depends only on agent $j_1$'s private
signal and choice of pure strategy, which are both equal to those of $j_2$. Hence $A^{j_1}_0 = A^{j_2}_0$ for all
$j_1 \in B_r(G_1,i_1)$.

Assume now that the claim holds up to some $t - 1 < r - 1$. Let $j_1$ be such that $t < r - \Delta(i_1,j_1)$.
We would like to show that $A^{j_1}_t = A^{j_2}_t$. Let $k_1$ be a neighbor of $j_1$, and let $k_2 = h(k_1)$. Then $t - 1 < r - \Delta(i_1, k_1)$, and
so $A^{k_1}_{t'} = A^{k_2}_{t'}$ for all $t' \le t - 1$, by the inductive assumption. Since $A^{j_1}_t$ depends only on $j_1$'s private
signal, choice of pure strategy and the actions of its neighbors in previous time periods, and since
these are all identical to those of $j_2$, it indeed follows that $A^{j_1}_t = A^{j_2}_t$. □

Recalling the definition

$$u_{i,t} = \mathbb{P}\big[A^i_t = S\big],$$

the following corollary is a direct consequence of this claim.

Corollary 3.7. Let $\mathcal{G}_1 = (\mu_0, \mu_1, \lambda, G_1)$ and $\mathcal{G}_2 = (\mu_0, \mu_1, \lambda, G_2)$ be games. Let $h$ be a rooted graph
isomorphism between $B_{r+1}(G_1,i_1)$ and $B_{r+1}(G_2,i_2)$ for some $r > 0$, and let $Q^{j_1} = Q^{j_2}$ for all
$j_1 \in B_r(G_1,i_1)$ and $j_2 = h(j_1)$.
Then $u_{i_1,t} = u_{i_2,t}$ for all $t < r$.

3.3 A topology on strategies and the existence of equilibria 

In the following theorem we show that the agents' set of strategies admits a compact topology 
which preserves the continuity of the utilities. From this we infer the existence of equilibria for 
this game, which is not immediate, since the number of players may be infinite. We also use this 
topology to define a compact topology on equilibria, which is a component of the proof of our main 
theorem. 

For a fixed private belief $I_i$, a pure strategy is a function from the actions of neighbors to
actions, which we call a response.

Definition 3.8. Let $G = (V,E)$ be a social network. A response at time $t$ of an agent $i \in V$
is a function $r_{i,t} : \{0,1\}^{|\partial i| \cdot t} \to \{0,1\}$. A response of an agent $i$ is the sequence of functions
$r_i = (r_{i,0}, r_{i,1}, \ldots)$. Let $\mathcal{R}_i$ be the space of responses of agent $i$.

A (mixed) strategy of agent $i$ can be thought of as a measure on the product space $[0, 1] \times \mathcal{R}_i$
of private beliefs and responses, with the marginal on the first coordinate equaling the distribution
of $I_i$. Milgrom and Weber [18] call this representation a distributional strategy. They show that
for a Bayesian game with a finite number of players, and given some conditions, the weak topology
on distributional strategies is compact and keeps the utilities continuous. Hence, by Glicksberg's
theorem [13], the game has an equilibrium. The next lemma shows that these conditions apply
in our case, when the number of agents is finite.

Lemma 3.9. Fix $G = (V,E)$, with $V$ finite. Then for each agent $i$ there exists a topology $\mathcal{T}_i$ on
its strategy space such that the strategy space is compact and the utilities $u_j$ are continuous in the
product of the strategy spaces.

Proof. We prove the lemma by showing that the conditions of Theorem 1 in [18] are met.

1. The set of private beliefs (types in the language of [18]) is $[0, 1]$, a complete separable metric
space, as required. Furthermore, the distribution of private beliefs is absolutely continuous
with respect to the product of their marginal distributions. This fulfills condition R2 of [18].

2. The utilities $u_j$ are bounded, measurable functions of the private beliefs and the responses.

3. Define a metric $D$ on $i$'s responses $\mathcal{R}_i$ by

$$D(r_i, r'_i) = \exp\left(-\min\{t : r_{i,t} \neq r'_{i,t}\}\right).$$

This can be easily verified to indeed be a metric. By a standard diagonalization argument it
follows that $\mathcal{R}_i$ is compact in the topology induced by this metric, as required.

Furthermore, for fixed private beliefs, the utilities $u_j$ are equicontinuous in the responses: if a
response is changed by at most $\delta = e^{-T}$ (in terms of the metric $D$) then it remains unchanged
in the first $T$ time periods, and so the utilities are changed by at most $\epsilon = (1 - \lambda) \sum_{t=T}^{\infty} \lambda^t = \lambda^T$.
This fulfills condition R1 of [18].

Since these conditions are met, it follows by Theorem 1 in [18] that the mixed strategies of agent $i$
are compact in the weak topology $\mathcal{T}_i$, and that the utilities $u_j$ are, under $\mathcal{T}_i$, a continuous function
of the strategies. □
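
The equicontinuity bound used in the proof of Lemma 3.9 follows from a geometric series. As a short worked step, using only the definitions above: two responses at distance at most $e^{-T}$ agree in periods $0, \ldots, T-1$, and the utility in any single period changes by at most one, so the total change is bounded by the discounted weight of the periods from $T$ onwards.

```latex
(1-\lambda)\sum_{t=T}^{\infty}\lambda^{t}
  \;=\; (1-\lambda)\,\lambda^{T}\sum_{s=0}^{\infty}\lambda^{s}
  \;=\; (1-\lambda)\,\lambda^{T}\cdot\frac{1}{1-\lambda}
  \;=\; \lambda^{T}.
```

Since $0 < \lambda < 1$, this bound tends to zero as $T$ grows, uniformly over private beliefs.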
Note that under the above defined topology on $\mathcal{R}_i$ the set of pure strategies is separable, and
so the topology $\mathcal{T}_i$ on (mixed) strategies is metrizable, e.g. with the Lévy-Prokhorov metric [8].

Given Lemma 3.9, a direct application of Glicksberg's theorem [13] yields that every game with
a finite number of agents has an equilibrium. The extension of this result to infinitely many agents
is not immediate, and requires us to invoke the graphical nature of the game. We do this in the
following section. Before turning to that we will prove an additional version of Lemma 3.9.



Lemma 3.10. Fix $G = (V,E)$, with $V$ finite. Then for each agent $i$ there exists a topology $\mathcal{T}_i$ on
its strategy space such that the strategy space is compact and the utilities in each time $t$, $u_{j,t}$, are
continuous in the product of the strategy spaces.

Proof. The proof is identical to the proof of Lemma 3.9 above, except that we let each agent's
utility be defined by $u'_j = u_{j,t}$; that is, we set the discount factor to be one at time $t$ and zero
otherwise. Since in the proof above we required of the discount factors nothing more than to have
a finite sum, the proof still applies, and the utilities (in this case $u_{j,t}$) are continuous in the product
of the strategy spaces. □

3.4 The space of rooted graph strategies and its topology 



Lemma 3.9 implies, by Glicksberg's fixed point theorem [13], that any game on a finite graph has an 
equilibrium. However, Glicksberg's theorem only applies to a finite number of agents, and therefore 
we cannot use it to prove that equilibria exist in general. To that end we define a topology
on rooted graph strategies. 

Definition 3.11. Let $G = (V,E)$ and $G' = (V',E')$ be strongly connected directed graphs. Let
$(G,i)$ and $(G',i')$ be rooted graphs. Let $Q$ and $R$ be strategy profiles for the agents in $V$ and $V'$,
respectively. We say that the triplet $(G,i,Q)$ is equivalent to the triplet $(G',i',R)$ if there exists a
rooted graph isomorphism $h$ from $(G,i)$ to $(G',i')$ such that $Q^j = R^{h(j)}$ for all $j \in V$. The space of
rooted graph strategies $\mathcal{GS}$ is the set of equivalence classes induced by this equivalence relation.
We denote an element of this space by $[G, i, Q]$.

Recall that we defined above a metrizable compact topology $\mathcal{T}_i$ on agent $i$'s strategy space; let $d$ be
a metric that induces $\mathcal{T}_i$. Let $i$ and $i'$ be agents in graphs $G$ and $G'$, respectively. We can use $d$
as a metric between their strategies, as long as we uniquely identify each neighbor of one with a
neighbor of the other. Let $h$ be a bijection between $\partial i'$ and $\partial i$. Then $d_h(Q^i, Q^{i'})$ will denote the
distance thus defined between $Q^i$ and $Q^{i'}$.

We now use the metrics of strategies and of rooted graphs to define a metric on rooted graph 
strategies. 

Definition 3.12. Let $[G,i,Q]$ and $[G',i',R]$ be rooted graph strategies. For $r \in \mathbb{N}$, let $H(r)$ be
the (perhaps empty) set of rooted graph isomorphisms between $B_r(G,i)$ and $B_r(G',i')$. Define the
distance $D([G,i,Q],[G',i',R])$ by

$$D([G,i,Q],[G',i',R]) = \inf_{r \in \mathbb{N}} \left\{ \max\left\{ 2^{-r},\ \min_{h \in H(r+1)}\ \max_{j \in B_r(G,i)} d_h\big(Q^j, R^{h(j)}\big) \right\} \right\}. \tag{2}$$



Note that the choice of $h \in H(r + 1)$ and then $j \in B_r(G, i)$ guarantees that $h$ is a bijection from
the set of neighbors of $j$ to the set of neighbors of $h(j)$. It is straightforward (if tedious) to show
that $D(\cdot,\cdot)$ is indeed a well defined metric.

Definition 3.13. The utility map $u : \mathcal{GS} \to \mathbb{R}$ is given by

$$u([G,i,Q]) = u_i(G,Q).$$

The utility map at time $t$, $u_t : \mathcal{GS} \to \mathbb{R}$, is given by

$$u_t([G,i,Q]) = u_{i,t}(G,Q).$$

This is a straightforward recasting of Definition 2.7 into the language of rooted graph strategy
spaces. Note that the map is well defined, as clearly $u_i(G,Q) = u_{i'}(G',R)$ if $(G,i,Q)$ is equivalent to $(G',i',R)$;
the two are simply different namings of a group of agents with an identical social network and
identical strategies, who therefore have identical utilities.

Lemma 3.14. The utility map $u : \mathcal{GS} \to \mathbb{R}$ is continuous.

Equivalently, if $[G_n,i_n,Q_n] \to_n [G,i,Q]$ then $u_{i_n}(G_n,Q_n) \to_n u_i(G,Q)$.

Proof. We will prove the claim by showing that $u_t$ is continuous. The claim will follow because,
by the bounded convergence theorem, if $f$ is a linear combination of the uniformly bounded maps
$\{f_t\}_{t=0}^{\infty}$, with summable positive coefficients, then the continuity of all the maps $f_t$ implies the
continuity of $f$.

Let $[G_n,i_n,Q_n] \to_n [G,i,Q]$. We will show that $u_t([G_n,i_n,Q_n]) \to_n u_t([G,i,Q])$.

Consider a sequence of games $\mathcal{G}_n$ which are all played on the finite graph $B = B_{t+1}(G,i)$.
Since $[G_n,i_n,Q_n] \to_n [G,i,Q]$ then there exists some $N$ such that, for $n > N$, it holds that
$D([G_n,i_n,Q_n],[G,i,Q]) < 2^{-(t+1)}$. Hence, by the definition of $D(\cdot,\cdot)$, it holds that $B \cong B_{t+1}(G_n,i_n)$
for $n > N$. Denote by $h_n$ an isomorphism between the two balls that minimizes $\max_{j \in B_t(G,i)} d_{h_n}\big(Q^j, Q_n^{h_n(j)}\big)$,
as appears in the definition of $D(\cdot,\cdot)$.

Let each agent $j$ in $B_t(G,i)$ play $Q_n^{h_n(j)}$ in $\mathcal{G}_n$, and let the rest of the agents in $B$ (i.e., those at
distance $t + 1$ from $i$) play arbitrary strategies. Denote by $R_n$ the strategy profile played by the
agents at game $\mathcal{G}_n$, and denote by $R$ the restriction of $Q$ to $B$. By Corollary 3.7, $u_t([G_n,i_n,Q_n]) =
u_t([B,i,R_n])$ and $u_t([G,i,Q]) = u_t([B,i,R])$. Therefore it is left to show that $u_t([B,i,R_n]) \to_n
u_t([B,i,R])$.

Now, $D([G_n,i_n,Q_n],[G,i,Q]) \to 0$. Hence $d_{h_n}\big(Q^j, Q_n^{h_n(j)}\big) \to 0$, and so the strategies of each agent
$j$ in $B_t(G,i)$ converge to the strategy $R^j = Q^j$. Furthermore, the strategies in $B_t(G,i)$ converge
uniformly, since there is only a finite number of them, and so the strategy profile converges in
the product topology. It follows that $u_{i,t}(B,R_n)$, which by Lemma 3.10 is a continuous function of the
strategies in $B_t(G,i)$, converges to $u_{i,t}(B,R)$. □
Claim 3.15. Let $\{[G_n,i_n,Q_n]\}_{n=1}^{\infty}$ be a sequence of rooted graph strategies such that

1. $[G_n,i_n,Q_n] \to_n [G,i,Q]$.

2. $Q_n$ is an equilibrium strategy profile of $\mathcal{G}_n = (\mu_0, \mu_1, \lambda, G_n)$.

Then $Q$ is an equilibrium strategy profile of $\mathcal{G} = (\mu_0, \mu_1, \lambda, G)$.

Proof. Let $R$ be a strategy profile for the agents in $G$ such that $R^j = Q^j$ for all $j \neq i$. We will
show that $u_i(G, Q) \ge u_i(G, R)$.

Let $R_n$ be the strategy profile for agents on $G_n$ defined by $R_n^{j_n} = Q_n^{j_n}$ for $j_n \neq i_n$, and let
$R_n^{i_n} = R^i$. Note that $[G_n,i_n,R_n] \to_n [G,i,R]$.

Since $Q_n$ is an equilibrium profile of $\mathcal{G}_n$,

$$u_{i_n}(G_n,Q_n) \ge u_{i_n}(G_n,R_n).$$

Taking the limit of both sides and substituting the definition of the utility map we get that

$$\lim_n u([G_n,i_n,Q_n]) \ge \lim_n u([G_n,i_n,R_n]).$$

Finally, since by Lemma 3.14 above the utility map is continuous, we have that

$$u([G,i,Q]) \ge u([G,i,R]).$$

□



Claim 3.16. Let $\{[G_n,i_n,Q_n]\}_{n=1}^\infty$ be a sequence of rooted graph strategies such that the limit $\lim_n [G_n,i_n]$ exists and is equal to some $[G,i]$. Then there exists a subsequence $\{[G_{n_k},i_{n_k},Q_{n_k}]\}_{k=1}^\infty$ such that the limit $\lim_k [G_{n_k},i_{n_k},Q_{n_k}]$ exists and is equal to $[G,i,Q]$ for some strategy profile $Q$.



Proof. By Claim 3.5 there exists a subsequence $\{[G_{n_r},i_{n_r}]\}_{r=1}^\infty$ such that $B_r(G_{n_r},i_{n_r}) \cong B_r(G,i)$. We will therefore assume without loss of generality that $n_r = r$, i.e., limit ourselves to this subsequence. Accordingly, let $h_n : V \to V_n$ be a sequence of rooted graph isomorphisms between $B_n(G,i)$ and $B_n(G_n,i_n)$. Note that since out-degrees are finite, $B_n(G,i)$ is finite for all $n$.

Let $j$ be a vertex of $G$, and let $r_j$ be the graph distance between $i$ and $j$. For $n > r_j + 1$, denote $j_n = h_n(j)$. Note that $h_n$ also maps the neighbors of $j$ to the neighbors of $j_n$.

We will now construct $Q$, the strategy profile of the agents in $G$ such that $[G_n,i_n,Q_n] \to_n [G,i,Q]$, starting with agent 1 of $G$. Since the space of strategies is compact, the sequence $\{Q_n^{1_n}\}_{n=r_1+1}^\infty$ has a converging subsequence, i.e., one along which $d_{h_n}(Q_n^{1_n},Q^1) \to_n 0$ for some strategy $Q^1$, which we will assign to agent 1 in $G$. Likewise, this subsequence has a subsequence along which $d_{h_n}(Q_n^{2_n},Q^2) \to_n 0$, etc. Thus, by a standard diagonalization argument, there exists a subsequence $\{[G_{n_k},i_{n_k},Q_{n_k}]\}_{k=1}^\infty$ with isomorphisms $h_{n_k}$ such that $d_{h_{n_k}}(Q_{n_k}^{j_{n_k}},Q^j) \to_k 0$ for all $j$. It is now straightforward to verify that $D([G_{n_k},i_{n_k},Q_{n_k}],[G,i,Q]) \to_k 0$: pick some $r > 0$ and then $k$ large enough so that $h_{n_k}$ is an isomorphism between $B_{r+1}(G_{n_k},i_{n_k})$ and $B_{r+1}(G,i)$. Then, by the definition of $D$,

$$D([G_{n_k},i_{n_k},Q_{n_k}],[G,i,Q]) \leq \max\Big\{2^{-r},\ \max_{j \in B_r(G,i)} d_{h_{n_k}}(Q_{n_k}^{j_{n_k}},Q^j)\Big\}.$$

If we now further increase $k$ then $d_{h_{n_k}}(Q_{n_k}^{j_{n_k}},Q^j) \to 0$, and since $B_r(G,i)$ is finite we have that $D([G_{n_k},i_{n_k},Q_{n_k}],[G,i,Q]) \leq 2^{-r}$ for $k$ large enough. Since this holds for all $r$,

$$D([G_{n_k},i_{n_k},Q_{n_k}],[G,i,Q]) \to_k 0$$

and

$$\lim_{k \to \infty} [G_{n_k},i_{n_k},Q_{n_k}] = [G,i,Q].$$



□ 



Theorem 3.17 (2.13). Every game $\mathcal{G}$ has an equilibrium.

Proof. Let $\mathcal{G} = (\mu_0,\mu_1,\lambda,G)$. Let $i$ be a vertex in $G$, denote $G_n = B_n(G,i)$, and denote its root by $i_n$. Let $\{\mathcal{G}_n = (\mu_0,\mu_1,\lambda,G_n)\}_{n=1}^\infty$ be a sequence of finite games with equilibrium strategy profiles $Q_n$. Then $[G_n,i_n] \to_n [G,i]$, and so by Claim 3.16 there exists a strategy profile $Q$ and a subsequence $\{[G_{n_k},i_{n_k},Q_{n_k}]\}_{k=1}^\infty$ such that

$$\lim_{k \to \infty} [G_{n_k},i_{n_k},Q_{n_k}] = [G,i,Q].$$

Finally, by Claim 3.15, $Q$ is an equilibrium profile of $\mathcal{G}$. □



3.5 Compact graphs and graph strategies 

Definition 3.18. Let $\mathcal{E}(L,d)$ be the space of isomorphism classes of $L$-connected, degree $d$, strongly connected rooted graphs, equipped with the topology induced by the metric $D(\cdot,\cdot)$.

That is, $\mathcal{E}(L,d)$ is a space of egalitarian graphs. The following is a standard result (see [19]).

Theorem 3.19. $\mathcal{E}(L,d)$ is compact.

Since our topology on rooted graph isomorphism classes is metrizable, compactness implies that $\mathcal{E}(L,d)$ is also sequentially compact, i.e., that every sequence in $\mathcal{E}(L,d)$ has a converging subsequence, converging to a point in $\mathcal{E}(L,d)$.

Definition 3.20. Let $\mathcal{K}$ be a set of graphs. Denote by $\mathcal{R}(\mathcal{K})$ the set of rooted graph isomorphism classes $[G,i]$ such that $G \in \mathcal{K}$.

Let $\mathcal{R}$ be a set of rooted graph isomorphism classes. Denote by $\mathcal{GS}(\mathcal{R})$ the set of rooted graph strategies $[G,i,Q]$ such that $[G,i] \in \mathcal{R}$ and $Q$ is an equilibrium strategy profile for $G$.

Claim 3.21. Let $\mathcal{R}$ be a compact set of rooted graph isomorphism classes. Then $\mathcal{GS}(\mathcal{R})$ is a compact set of equilibrium rooted graph strategies.

Proof. Since our topology on graph strategies is metrizable, we will prove the claim by showing that any sequence of points in $\mathcal{GS}(\mathcal{R})$ has a converging subsequence with a limit in $\mathcal{GS}(\mathcal{R})$.

Let $\{[G_n,i_n,Q_n]\}_{n=1}^\infty$ be a sequence of points in $\mathcal{GS}(\mathcal{R})$. Since $\mathcal{R}$ is compact, the sequence $\{[G_n,i_n]\}_{n=1}^\infty$ has a converging subsequence $\{[G_{n_k},i_{n_k}]\}_{k=1}^\infty$ that converges to some $[G,i] \in \mathcal{R}$. Hence, by Claim 3.16, the sequence $\{[G_n,i_n,Q_n]\}_{n=1}^\infty$ has a converging subsequence that, for some $Q$, converges to $[G,i,Q]$. Finally, by Claim 3.15, $Q$ is an equilibrium strategy profile for $G$, and so $[G,i,Q] \in \mathcal{GS}(\mathcal{R})$. □
3.6 Agreement 

Recall that the infinite action set $C_i$ of agent $i$ is defined by

$$C_i = C_i(G,Q) = \{s \in \{0,1\} : A^i_t(G,Q) = s \text{ for infinitely many values of } t\}.$$

We shall show in this section that it follows from the work of Rosenberg, Solan and Vieille [24] that any action in $C_i$ is also an optimal action, as far as agent $i$ can tell; every agent stops behaving strategically at some point. This will be useful in the proof of Theorem 3, which states that $C_i = C_j$ for all agents $i$ and $j$.

Recall that a strategy of agent $i$ at time $t$ is a function of its private belief $I_i$ and the actions of its neighbors in previous time periods, $A^{\partial i}_{[0,t)}$. Hence we can think of the sigma-algebra generated by these random variables as the "information available to agent $i$ at time $t$":

Definition 3.22. Denote by

$$\mathcal{F}^i_t = \mathcal{F}^i_t(G,Q) = \sigma(I_i, Q^i, A^{\partial i}_{[0,t)})$$

the information available to agent $i$ at time $t$, and denote by

$$\mathcal{F}^i_\infty = \mathcal{F}^i_\infty(G,Q) = \sigma\big(\cup_{t=0}^\infty \mathcal{F}^i_t\big)$$

the information available to agent $i$ at the limit $t \to \infty$.

Note that $\mathcal{F}^i_t$ includes the sigma-algebra generated by $i$'s private belief, the actions of $i$'s neighbors before time $t$, and $i$'s pure strategy; $i$ knows which pure strategy it has chosen.

Since the expected utility of action $s$ at time $t$ is $P[s = S]$, a myopic agent would take an action $s$ in $\{0,1\}$ that maximizes $P[s = S \mid \mathcal{F}^i_t]$. This motivates the following definition:

Definition 3.23. Denote by

$$B^i_t = B^i_t(G,Q) = \operatorname{argmax}_{s \in \{0,1\}} P[s = S \mid \mathcal{F}^i_t(G,Q)]$$

the best response of agent $i$ at time $t$. Likewise denote by

$$B^i_\infty = B^i_\infty(G,Q) = \operatorname{argmax}_{s \in \{0,1\}} P[s = S \mid \mathcal{F}^i_\infty]$$

the set of best responses of agent $i$ at the limit $t \to \infty$.

At any time $t$ there is indeed almost surely only one action that maximizes $P[s = S \mid \mathcal{F}^i_t(G,Q)]$, since we require that the distribution of private beliefs be non-atomic. This does not hold at the limit $t \to \infty$, and so we let $B^i_\infty$ be a set which can take the values $\{0\}$, $\{1\}$ or $\{0,1\}$. Note that

$$B^i_\infty = \{0,1\} \text{ iff } P[S = 1 \mid \mathcal{F}^i_\infty] = \tfrac{1}{2}.$$
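
The case distinction in Definition 3.23 can be made concrete with a small sketch. The function name `best_response_set` and the optional tolerance `tol` are illustrative assumptions and not part of the paper; the input is the posterior probability that $S = 1$ given the agent's information.

```python
from typing import Set


def best_response_set(posterior_one: float, tol: float = 0.0) -> Set[int]:
    """Return argmax_s P[s = S | F] for a binary state S.

    posterior_one is P[S = 1 | F]; tol is a hypothetical numerical
    tolerance for detecting the tie at exactly one half.
    """
    if not 0.0 <= posterior_one <= 1.0:
        raise ValueError("posterior must lie in [0, 1]")
    if abs(posterior_one - 0.5) <= tol:
        return {0, 1}  # tie: both actions are optimal
    return {1} if posterior_one > 0.5 else {0}
```

With a non-atomic posterior the tie has probability zero at any finite time, which is why the two-element set only matters in the limit.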

The following theorem is a restatement, in our notation, of Proposition 2.1 in Rosenberg, Solan 
and Vieille [23]. 

Theorem 3.24 (Rosenberg, Solan and Vieille). Let $\mathcal{G} = (\mu_0,\mu_1,\lambda,G)$ be a game with equilibrium $Q$. Then for any agent $i$ it holds that $C_i(G,Q) \subseteq B^i_\infty(G,Q)$ almost surely.

That is, any action that $i$ takes infinitely often is optimal, given all the information agent $i$ eventually learns. Note that this theorem is stated in [24] for a finite number of agents. However, a careful reading of the proof reveals that it holds equally for a countably infinite set of agents. The same holds for their Theorem 2.3, in which they further prove the following agreement result.

Theorem 3.25 (Rosenberg, Solan and Vieille). Let $\mathcal{G} = (\mu_0,\mu_1,\lambda,G)$ be a game with equilibrium $Q$. Let $j$ be a neighbor of $i$. Then $C_j(G,Q) \subseteq B^i_\infty(G,Q)$ almost surely.

Equivalently, if $i$ observes $j$, and $j$ takes an action $a$ infinitely often, then $a$ is an optimal action for $i$. If we could show that $B^i_\infty = C_i$ for all $i$, it would follow from these two theorems, and the fact that the graph is strongly connected, that $C_i = C_j$ for all agents $i$ and $j$; the agents would agree on their optimal action sets. This is precisely what we shall do in the remainder of this section.

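Definition 3.26. Denote by $Z^i_t$ the log-likelihood ratio of the events $S = 1$ and $S = 0$, conditioned on $\mathcal{F}^i_t$, the information available to agent $i$ at time $t$:

$$Z^i_t = \log \frac{P[S = 1 \mid \mathcal{F}^i_t]}{P[S = 0 \mid \mathcal{F}^i_t]},$$

and let

$$Z^i_\infty = \log \frac{P[S = 1 \mid \mathcal{F}^i_\infty]}{P[S = 0 \mid \mathcal{F}^i_\infty]}.$$

Let $Y^i_t$ be defined as follows:

$$Y^i_t = \log \frac{P\big[A^{\partial i}_{[0,t)} \mid A^i_{[0,t)}, Q^i, S = 1\big]}{P\big[A^{\partial i}_{[0,t)} \mid A^i_{[0,t)}, Q^i, S = 0\big]},$$

where $A^i_{[0,t)}$ is the sequence of actions of $i$ up to time $t-1$, and $A^{\partial i}_{[0,t)}$ is the sequence of actions of $i$'s neighbors up to time $t-1$. Finally, let

$$Y^i_\infty = \lim_{t \to \infty} Y^i_t.$$

Note that it is not clear that the limit $\lim_t Y^i_t$ exists. We show this in the following claim.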
Claim 3.27. Denote by $I_{-i}$ the private beliefs of all agents but $i$. Then

1. $\lim_t Z^i_t = Z^i_\infty$ almost surely.

2. $Z^i_t = Y^i_t + Z^i_0$.

3. $\lim_t Y^i_t = Y^i_\infty$ almost surely, and $Y^i_\infty$ is measurable in $\sigma(A^i_{[0,\infty)}, I_{-i}, Q)$.

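Proof. 1. Recall that

$$Z^i_t = \log \frac{P[S = 1 \mid \mathcal{F}^i_t]}{P[S = 0 \mid \mathcal{F}^i_t]}.$$

Since $\{\mathcal{F}^i_t\}_{t=0}^\infty$ is a filtration, $P[S = 1 \mid \mathcal{F}^i_t]$ is a martingale, which converges a.s. since it is bounded. Hence $Z^i_t$ also converges, and in particular

$$\lim_{t} Z^i_t = \log \frac{P[S = 1 \mid \mathcal{F}^i_\infty]}{P[S = 0 \mid \mathcal{F}^i_\infty]} = Z^i_\infty.$$

2. By the definition of $\mathcal{F}^i_t$,

$$Z^i_t = \log \frac{P\big[S = 1 \mid I_i, Q^i, A^{\partial i}_{[0,t)}\big]}{P\big[S = 0 \mid I_i, Q^i, A^{\partial i}_{[0,t)}\big]} = \log \frac{P\big[A^{\partial i}_{[0,t)} \mid I_i, Q^i, S = 1\big]\, P[S = 1 \mid I_i, Q^i]}{P\big[A^{\partial i}_{[0,t)} \mid I_i, Q^i, S = 0\big]\, P[S = 0 \mid I_i, Q^i]},$$

where the second equality follows from Bayes' law. Now, conditioned on $S$ and $i$'s pure strategy $Q^i$, the probability for a sequence of actions $A^{\partial i}_{[0,t)}$ of $i$'s neighbors depends on $I_i$ only in as much as $I_i$ affects $i$'s actions up to time $t-1$, $A^i_{[0,t)}$. Hence

$$P\big[A^{\partial i}_{[0,t)} \mid I_i, Q^i, S\big] = P\big[A^{\partial i}_{[0,t)} \mid A^i_{[0,t)}, Q^i, S\big].$$

Note also that

$$Z^i_0 = \log \frac{P[S = 1 \mid I_i, Q^i]}{P[S = 0 \mid I_i, Q^i]}.$$

Therefore

$$Z^i_t = Y^i_t + Z^i_0.$$

3. Since $Z^i_t$ converges almost surely and $Z^i_t = Y^i_t + Z^i_0$, $Y^i_t$ also converges almost surely. Since each $Y^i_t$ is a function of $A^{\partial i}_{[0,t)}$, $A^i_{[0,t)}$ and $Q^i$, it follows that their limit, $Y^i_\infty$, is measurable in $\sigma(A^{\partial i}_{[0,\infty)}, A^i_{[0,\infty)}, Q^i)$. However, given $Q$, $A^{\partial i}_{[0,\infty)}$ is a function of $I_{-i}$ and $A^i_{[0,\infty)}$: for a choice of pure strategies the actions of all agents but $i$ can be determined given their private signals and the actions of $i$. Hence $Y^i_\infty$ is also measurable in $\sigma(A^i_{[0,\infty)}, I_{-i}, Q)$.

□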



Claim 3.28. The distribution of $Z^i_0$ is non-atomic, as is the distribution of $Z^i_0$ conditioned on $S$.

Proof. By definition,

$$Z^i_0 = \log \frac{P[S = 1 \mid I_i, Q^i]}{P[S = 0 \mid I_i, Q^i]}.$$

However, the choice of strategy is independent of both $I_i$ and $S$, and so

$$Z^i_0 = \log \frac{P[S = 1 \mid I_i]}{P[S = 0 \mid I_i]} = \log \frac{I_i}{1 - I_i}.$$

Since the distribution of $I_i$ is non-atomic (see Definition 2.3 and the comment after it), so is the distribution of $Z^i_0$. Since $S$ takes only two values, the same holds when conditioned on $S$. □

Theorem 3.29. Let $\mathcal{G} = (\mu_0,\mu_1,\lambda,G)$ be a game with equilibrium $Q$. Then almost surely $B^i_\infty(G,Q) = C_i(G,Q)$.



Proof. By its definition, $C_i$ takes values in $\{\{0\},\{1\},\{0,1\}\}$, and by Theorem 3.24 we have that $C_i \subseteq B^i_\infty$. Therefore the claim holds when $B^i_\infty = \{0\}$ or $B^i_\infty = \{1\}$, and it remains to show that $C_i = \{0,1\}$ when $B^i_\infty = \{0,1\}$, or that $P[C_i \neq \{0,1\},\ B^i_\infty = \{0,1\}] = 0$.

Let $a = (a_0,a_1,\ldots)$ be a sequence of actions, each in $\{0,1\}$, in which only one action appears infinitely often. Since there are only countably many such sequences, if $P[C_i \neq \{0,1\},\ B^i_\infty = \{0,1\}] > 0$ then there exists such a sequence $a$ for which $P[A^i_{[0,\infty)} = a,\ B^i_\infty = \{0,1\}] > 0$. We shall prove the claim by showing that

$$P\big[A^i_{[0,\infty)} = a,\ B^i_\infty = \{0,1\}\big] = 0.$$

Recall that by Claim 3.27 the event that $B^i_\infty = \{0,1\}$ is equal to the event that $Z^i_0 = -Y^i_\infty$. Recall also that by the same claim, $Y^i_\infty$ is measurable in $\sigma(A^i_{[0,\infty)}, I_{-i}, Q)$. Hence

$$P\big[A^i_{[0,\infty)} = a,\ B^i_\infty = \{0,1\} \mid S, I_{-i}, Q\big] = P\big[A^i_{[0,\infty)} = a,\ Z^i_0 = -Y^i_\infty(a,I_{-i},Q) \mid S, I_{-i}, Q\big] \leq P\big[Z^i_0 = -Y^i_\infty(a,I_{-i},Q) \mid S, I_{-i}, Q\big].$$

Now, by Claim 3.28, $Z^i_0$ conditioned on $S$ has a non-atomic distribution. Further conditioning on $Q$ and $I_{-i}$ leaves its distribution unchanged, since it is independent of the former, and independent of the latter conditioned on $S$. Hence the probability that it equals $-Y^i_\infty(a,I_{-i},Q)$ is zero. Hence

$$P\big[A^i_{[0,\infty)} = a,\ Z^i_0 = -Y^i_\infty\big] = E\Big[P\big[A^i_{[0,\infty)} = a,\ Z^i_0 = -Y^i_\infty \mid S, I_{-i}, Q\big]\Big] = 0.$$



□

Our agreement theorem is a direct consequence of Theorem 3.29.

Proof of Theorem 3. Let $i$ and $j$ be agents. Since $G$ is strongly connected, there exists a path from $i$ to $j$. By Theorem 3.25 we have, by induction along this path, that $C_j \subseteq B^i_\infty$ almost surely. But $C_i = B^i_\infty$ by Theorem 3.29 above, and so we have that $C_j \subseteq C_i$. However, there also exists a path from $j$ back to $i$, and so $C_i \subseteq C_j$, and the two are equal. This holds for any pair of agents, and so it follows that there exists a random variable $C$ such that $C_i = C$ for all $i$, almost surely. □



3.7 δ-independence

In this section we define a notion of near independence between two random variables, and state some related lemmas. The following definition is also taken (almost) verbatim from [19].

Let $d_{TV}(\cdot,\cdot)$ denote the total variation distance between two measures defined on the same measurable space. Let $A$ and $B$ be two random variables with joint distribution $\mu_{(A,B)}$. Then we denote by $\mu_A$ the marginal distribution of $A$, $\mu_B$ the marginal distribution of $B$, and $\mu_A \times \mu_B$ the product distribution of the marginal distributions.

Definition 3.30. Let $(X_1,X_2,\ldots,X_k)$ be random variables. We refer to them as $\delta$-independent if

$$d_{TV}\big(\mu_{(X_1,\ldots,X_k)},\ \mu_{X_1} \times \cdots \times \mu_{X_k}\big) \leq \delta.$$

I.e., the joint distribution $\mu_{(X_1,\ldots,X_k)}$ has total variation distance of at most $\delta$ from the product of the marginal distributions $\mu_{X_1} \times \cdots \times \mu_{X_k}$. Likewise, $(X_1,\ldots,X_k)$ are $\delta$-dependent if the total variation distance between these distributions is more than $\delta$. We shall also make use of the following notation:

Definition 3.31. Let $X_1,\ldots,X_k$ be random variables, and let $S$ be a binary random variable. We say that $(X_1,\ldots,X_k)$ are $\delta$-independent conditioned on $S$ if they are $\delta$-independent conditioned on both $S = 0$ and $S = 1$. Denote

$$\mathrm{dep}_S(X_1,\ldots,X_k) = \min\{\delta : (X_1,\ldots,X_k) \text{ are } \delta\text{-independent conditioned on } S\}.$$

Note that this minimum is indeed attained, by the definition of $\delta$-independence.

The proofs of the following two general claims are elementary and fairly straightforward. They appear in [19].

Claim 3.32. Let $A$, $B$ and $C$ be random variables such that $P[A \neq B] \leq \delta$ and $(B,C)$ are $\delta'$-independent. Then $(A,C)$ are $(2\delta + \delta')$-independent.

Claim 3.33. Let $A = (A_1,\ldots,A_k)$ and $X$ be random variables. Let $(A_1,\ldots,A_k)$ be $\delta_1$-independent and let $(A,X)$ be $\delta_2$-independent. Then $(A_1,\ldots,A_k,X)$ are $(\delta_1 + \delta_2)$-independent.
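
For finitely supported random variables, the quantities in Definitions 3.30 and 3.31 can be computed directly. The following is a minimal sketch, assuming the convention $d_{TV}(\mu,\nu) = \frac{1}{2}\sum_x |\mu(x) - \nu(x)|$; the names `independence_defect` and `dep_s`, and the dictionary representation of a joint distribution, are illustrative choices and not taken from the paper.

```python
from itertools import product
from typing import Dict, List, Tuple

Joint = Dict[Tuple[int, ...], float]  # outcome tuple -> probability


def marginals(joint: Joint, k: int) -> List[Dict[int, float]]:
    """Marginal distribution of each of the k coordinates."""
    margs: List[Dict[int, float]] = [dict() for _ in range(k)]
    for outcome, prob in joint.items():
        for idx, value in enumerate(outcome):
            margs[idx][value] = margs[idx].get(value, 0.0) + prob
    return margs


def independence_defect(joint: Joint) -> float:
    """Total variation distance between the joint distribution and the
    product of its marginals; (X_1..X_k) are delta-independent iff this
    value is at most delta."""
    k = len(next(iter(joint)))
    margs = marginals(joint, k)
    supports = [sorted(m) for m in margs]
    total = 0.0
    for outcome in product(*supports):
        prod_prob = 1.0
        for idx, value in enumerate(outcome):
            prod_prob *= margs[idx][value]
        total += abs(joint.get(outcome, 0.0) - prod_prob)
    return 0.5 * total


def dep_s(joint_given_s0: Joint, joint_given_s1: Joint) -> float:
    """dep_S: the smallest delta that works conditioned on both S = 0 and S = 1."""
    return max(independence_defect(joint_given_s0),
               independence_defect(joint_given_s1))
```

For example, two independent fair bits have defect 0, while two perfectly correlated fair bits have defect 1/2.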

Definition 3.34. Let $S$ be a binary random variable such that $P[S = 1] = 1/2$. We say that the binary random variables $(X_1,\ldots,X_k)$ are $(p,\delta)$-good estimators of $S$ if the following hold:

1. $P[X_\ell = S] \geq p$, for $\ell = 1,\ldots,k$.

2. $(X_1,\ldots,X_k)$ are $\delta$-independent, conditioned on $S$.

The following lemma captures the idea that sufficiently many $(p,\delta)$-good estimators give arbitrarily good estimates, for any $p > \frac{1}{2}$ and $\delta$ small enough.

Claim 3.35. Let $S$ be a binary random variable such that $P[S = 1] = 1/2$, and let $(X_1,\ldots,X_k)$ be $(\frac{1}{2} + \epsilon, \delta)$-good estimators of $S$.

Then there exists a function $a : \{0,1\}^k \to \{0,1\}$ such that

$$P[a(X_1,\ldots,X_k) = S] \geq 1 - e^{-2\epsilon^2 k} - \delta.$$

Proof. Let $(Y_1,\ldots,Y_k)$ be random variables such that the distribution of $(S,Y_i)$ is equal to the distribution of $(S,X_i)$ for all $i$, and let $(Y_1,\ldots,Y_k)$ be independent, conditioned on $S$. Then $(X_1,\ldots,X_k)$ can be coupled to $(Y_1,\ldots,Y_k)$ in such a way that they differ only with probability at most $\delta$. Therefore, if we show that $P[a(Y_1,\ldots,Y_k) = S] \geq q + \delta$ for some $a$ then it will follow that $P[a(X_1,\ldots,X_k) = S] \geq q$.

Denote $Y = \frac{1}{k}\sum_{i=1}^k Y_i$, and denote $\alpha_0 = E[Y \mid S = 0]$ and $\alpha_1 = E[Y \mid S = 1]$. It follows that

$$\alpha_1 - \alpha_0 = \frac{1}{k}\sum_{i=1}^k \big(2 P[Y_i = S] - 1\big) \geq 2\epsilon.$$

By the Hoeffding bound

$$P[Y \leq \alpha_1 - \epsilon \mid S = 1] \leq e^{-2\epsilon^2 k}$$

and

$$P[Y \geq \alpha_0 + \epsilon \mid S = 0] \leq e^{-2\epsilon^2 k}.$$

Let $a(Y_1,\ldots,Y_k) = \mathbf{1}_{Y > \alpha_1 - \epsilon}$. Since $\alpha_1 - \epsilon \geq \alpha_0 + \epsilon$, by the above we have that $P[a(Y_1,\ldots,Y_k) \neq S] \leq e^{-2\epsilon^2 k}$, and so

$$P[a(X_1,\ldots,X_k) = S] \geq 1 - e^{-2\epsilon^2 k} - \delta.$$

□
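
The Hoeffding step in the proof of Claim 3.35 can be checked numerically in a simple special case. The sketch below assumes the symmetric setting in which each of $k$ conditionally independent estimators equals $S$ with probability exactly $\frac{1}{2} + \epsilon$ under both states, so that $\alpha_1 - \epsilon = \frac{1}{2}$ and the threshold rule $a$ is a majority vote. The function name `threshold_error` is illustrative; the code computes the exact error of the rule from the binomial distribution, which can then be compared with $e^{-2\epsilon^2 k}$.

```python
from math import comb, exp


def threshold_error(k: int, eps: float) -> float:
    """Exact P[a(Y) != S] for the rule a = 1{mean(Y) > 1/2}, when the Y_i
    are independent given S and each equals S with probability 1/2 + eps
    (a symmetric special case, assumed here for illustration)."""
    p = 0.5 + eps
    # Given S = 1 the number of ones is Binomial(k, p); error iff 2m <= k.
    err_s1 = sum(comb(k, m) * p**m * (1 - p)**(k - m)
                 for m in range(k + 1) if 2 * m <= k)
    # Given S = 0 the number of ones is Binomial(k, 1 - p); error iff 2m > k.
    err_s0 = sum(comb(k, m) * (1 - p)**m * p**(k - m)
                 for m in range(k + 1) if 2 * m > k)
    return 0.5 * (err_s0 + err_s1)  # uniform prior on S


if __name__ == "__main__":
    for k in (5, 25, 101):
        print(k, threshold_error(k, 0.1), exp(-2 * 0.1**2 * k))
```

The exact error is typically well below the Hoeffding bound; the bound is what the proof needs because it holds uniformly over the conditional distributions of the estimators.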



3.8 The probability of learning 

We say that an agent "learns $S$" if $S$ is the only action that it takes infinitely often, i.e., if $C_i = \{S\}$. By Theorem 3, $C_i = C_j$ for all agents $i$ and $j$, and so if one agent learns $S$ then all learn. We now start to explore the probability of learning, with the ultimate goal of proving that it equals one under certain conditions.

Definition 3.36. Let the probability of learning map $p : \mathcal{GS} \to \mathbb{R}$ be given by

$$p([G,i,Q]) = \lim_{t \to \infty} P[A^i_t(G,Q) = S].$$

Before showing that $p$ is well defined (i.e., the limit exists), and proving that it is lower semi-continuous, we make the following additional definition.



Definition 3.37. Let

$$S_\infty = S_\infty([G,i,Q]) = \begin{cases} 0 & B^i_\infty(G,Q) = \{0\} \\ 1 & B^i_\infty(G,Q) = \{1\} \\ 1 & B^i_\infty(G,Q) = \{0,1\} \end{cases}$$

be a maximum a posteriori (MAP) estimator of $S$ given $\mathcal{F}^i_\infty$, for some agent $i$ in $G$.

Note that $S_\infty([G,i,Q])$ is independent of $i$, since, by Theorem 3, $C_i = C_j = C$ for all agents $i,j$ in $G$, and $B^i_\infty = C_i$ by Theorem 3.29. Note also that $S_\infty$ is indeed a MAP estimator of $S$ given $\mathcal{F}^i_\infty$, since by definition $B^i_\infty$ is the set of most probable estimates of $S$, given $\mathcal{F}^i_\infty$.

Claim 3.38.

$$p([G,i,Q]) = P[S_\infty([G,i,Q]) = S].$$

It follows that $p$ is well defined.

Proof. Recall that $C = B^i_\infty$ by Theorem 3.29. Therefore

$$\lim_{t \to \infty} P[A^i_t = S \mid C = \{0,1\}] = \lim_{t \to \infty} P[A^i_t = S \mid B^i_\infty = \{0,1\}].$$

Since the event that $B^i_\infty = \{0,1\}$ is identical to the event that $P[S = 1 \mid \mathcal{F}^i_\infty] = \frac{1}{2}$, and since $A^i_t$ is $\mathcal{F}^i_\infty$-measurable for all $t$, it follows that

$$\lim_{t \to \infty} P[A^i_t = S \mid C = \{0,1\}] = \tfrac{1}{2},$$

and also that

$$P[S_\infty = S \mid C = \{0,1\}] = \tfrac{1}{2}.$$

When $C = \{0\}$ or $C = \{1\}$ then $\lim_t A^i_t = S_\infty$, and so

$$\lim_{t \to \infty} P[A^i_t = S \mid C \neq \{0,1\}] = P[S_\infty = S \mid C \neq \{0,1\}].$$

Since we have equality when conditioning on both $C \neq \{0,1\}$ and $C = \{0,1\}$, we also have unconditioned equality and

$$p([G,i,Q]) = \lim_{t \to \infty} P[A^i_t = S] = P[S_\infty = S].$$

□



Since $S_\infty([G,i,Q])$ is independent of $i$, the following is a direct consequence of Claim 3.38.

Corollary 3.39. $p([G,i,Q]) = p([G,j,Q])$ for all $i,j$.

This corollary also follows from a claim that appears in Rosenberg, Solan and Vieille [24].

Definition 3.40. Given $\mu_0$ and $\mu_1$, denote $p^*(\mu_0,\mu_1) = \frac{1}{2} + \frac{1}{2} d_{TV}(\mu_0,\mu_1)$.

Claim 3.41. Given $\mu_0$ and $\mu_1$,

$$p([G,i,Q]) \geq p^*(\mu_0,\mu_1) > \tfrac{1}{2}$$

for any $G$, $i$ and equilibrium strategy profile $Q$.

Proof. $p^*(\mu_0,\mu_1) > \frac{1}{2}$, since $\mu_0 \neq \mu_1$. Let $S_{i,0}$ be the maximum a posteriori (MAP) estimator of $S$ given $i$'s private signal, $W_i$. Then (see Claim 3.30 in [19])

$$P[S_{i,0} = S] = p^*(\mu_0,\mu_1).$$

Now, $S_\infty$ is a MAP estimator of $S$ given $\mathcal{F}^i_\infty$. Since $\mathcal{F}^i_\infty$ includes $W_i$,

$$P[S_\infty = S] \geq P[S_{i,0} = S] = p^*(\mu_0,\mu_1),$$

and the claim follows by Claim 3.38.

□
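
The identity $P[S_{i,0} = S] = \frac{1}{2} + \frac{1}{2} d_{TV}(\mu_0,\mu_1)$ used in the proof above can be illustrated for a signal with finitely many values. This is only a sketch: the paper requires non-atomic private beliefs, so a discrete signal is an assumption made purely for illustration, as are the function names and the uniform prior on $S$.

```python
from typing import Sequence


def total_variation(mu0: Sequence[float], mu1: Sequence[float]) -> float:
    """d_TV between two distributions on the same finite signal space."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu0, mu1))


def p_star(mu0: Sequence[float], mu1: Sequence[float]) -> float:
    """p*(mu0, mu1) = 1/2 + 1/2 * d_TV(mu0, mu1), as in Definition 3.40."""
    return 0.5 + 0.5 * total_variation(mu0, mu1)


def map_accuracy(mu0: Sequence[float], mu1: Sequence[float]) -> float:
    """P[MAP estimate from one signal equals S], with P[S = 1] = 1/2.

    For each signal value the MAP estimator picks the state with the larger
    joint probability, so its accuracy is the sum of those maxima.
    """
    return sum(max(0.5 * a, 0.5 * b) for a, b in zip(mu0, mu1))


if __name__ == "__main__":
    mu0, mu1 = [0.7, 0.3], [0.4, 0.6]  # hypothetical signal distributions
    print(p_star(mu0, mu1), map_accuracy(mu0, mu1))
```

The two quantities agree because $\max(a,b) = \frac{1}{2}(a + b + |a - b|)$, summed over the signal space.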
Theorem 3.42. $p$ is lower semi-continuous, i.e., if $[G_n,i_n,Q_n] \to_n [G,i,Q]$ then $\liminf_n p([G_n,i_n,Q_n]) \geq p([G,i,Q])$.

Proof. Recall that the utility of agent $i$ at time $t$ is given by the utility map at time $t$ (Definition 3.13):

$$u_t([G,i,Q]) = P[A^i_t(G,Q) = S].$$

Hence an alternative definition of $p$ is that

$$p([G,i,Q]) = \lim_{t \to \infty} u_t([G,i,Q]).$$

Now, $A^i_t$ is $\mathcal{F}^i_\infty$-measurable. Hence, since $S_\infty$ is a MAP estimator of $S$ given $\mathcal{F}^i_\infty$, it follows that

$$P[S_\infty = S] \geq P[A^i_t = S],$$

or that

$$p([G,i,Q]) \geq u_t([G,i,Q]).$$

Hence

$$p([G,i,Q]) = \sup_t u_t([G,i,Q]),$$

and since $u_t$ is continuous (see the proof of Lemma 3.14), it follows that $p$ is lower semi-continuous.

□



3.9 Proof of main theorems 

The following lemma, which we shall prove in the next section, is the main ingredient in the proof of our main Theorem 1. Before stating it we would like to remind the reader that if $\mathcal{K}$ is a set of graphs then $\mathcal{R}(\mathcal{K})$ is the set of rooted $\mathcal{K}$ graphs, and $\mathcal{GS}(\mathcal{R}(\mathcal{K}))$ is the set of $\mathcal{K}$ equilibrium graph strategies (Definition 3.20). Recall also that $p^*(\mu_0,\mu_1) = \frac{1}{2} + \frac{1}{2} d_{TV}(\mu_0,\mu_1)$ (Definition 3.40).



Lemma 3.43. Let $G$ be an infinite, strongly connected graph such that the closure of $\mathcal{GS}(\mathcal{R}(\{G\}))$ is compact. Then for all equilibrium strategy profiles on $G$, $k \in \mathbb{N}$, $\epsilon > 0$ and $\delta > 0$ there exists an agent $i$ in $G$, a time $t$ and $\mathcal{F}^i_t$-measurable binary random variables $(X_1,\ldots,X_k)$ that are $(p^*(\mu_0,\mu_1) - \epsilon, \delta)$-good estimators of $S$.

The next theorem is a more general version of our main Theorem 1. We shall prove it now, assuming Lemma 3.43, and derive from it the proof of Theorem 1.



Theorem 3.44. Let $G$ be an infinite, strongly connected graph such that the closure of $\mathcal{GS}(\mathcal{R}(\{G\}))$ is compact, and let $Q$ be any equilibrium strategy profile of $\mathcal{G}$. Then

$$P[C(G,Q) = \{S\}] = 1.$$

Proof. We first show that $p([G,i,Q]) = 1$. Assume by way of contradiction that $p < 1$. By Claim 3.41, we have that $p > 1/2$.

By Lemma 3.43, for every $k \in \mathbb{N}$ and $\delta > 0$ there exist $(X_1,\ldots,X_k)$ that are $\mathcal{F}^i_t$-measurable for some $i$ and $t$, are $\delta$-independent conditioned on $S$, and are each equal to $S$ with probability greater than one half, since $p^*(\mu_0,\mu_1) > \frac{1}{2}$.

By Claim 3.35 it follows that for $k$ large enough and $\delta$ small enough, there exists an estimator $\hat{S}$ of $S$ that is a function of $(X_1,\ldots,X_k)$, and is equal to $S$ with probability strictly greater than $p$.

This $\hat{S}$ is $\mathcal{F}^i_\infty$-measurable, and so a MAP estimator of $S$ given $\mathcal{F}^i_\infty$ must also equal $S$ with probability greater than $p$. However, $S_\infty$ is a MAP estimator of $S$ given $\mathcal{F}^i_\infty$, and it equals $S$ with probability $p$ (Claim 3.38), and so we have a contradiction. Hence $p([G,i,Q]) = 1$.

Now, by Claim 3.38 we have that

$$P[S_\infty([G,i,Q]) = S] = p([G,i,Q]) = 1.$$

By the definition of $S_\infty$ we have that $S_\infty = S$ iff $B^i_\infty = \{S\}$ for some (equivalently all) $i$. Since, by Theorem 3.29, $C = C_i = B^i_\infty$, it follows that $P[C = \{S\}] = 1$. □



The proofs of our main theorems now follow. 

Proof of Theorem 1. Since $G$ is $L$-connected and of degree $d$, so are all the graphs in $\mathcal{R}(\{G\})$. By Theorem 3.19, $\mathcal{E}(L,d)$, the set of $L$-connected degree $d$ rooted graphs, is compact. Hence the closure of $\mathcal{R}(\{G\}) \subseteq \mathcal{E}(L,d)$ is also compact, and so by Claim 3.21 we have that the closure of $\mathcal{GS}(\mathcal{R}(\{G\}))$ is compact. The theorem therefore follows from Theorem 3.44 above. □



Proof of Theorem 2. Let $\mathcal{K}_n$ be the set of $L$-connected, degree $d$ graphs with $n$ vertices. Since $\mathcal{K}_n$ is finite, $\mathcal{R}(\mathcal{K}_n)$ is compact, and so, by Claim 3.21, $\mathcal{GS}(\mathcal{R}(\mathcal{K}_n))$ is also compact. Since the map $p$ is lower semi-continuous (Theorem 3.42), it attains a minimum on $\mathcal{GS}(\mathcal{R}(\mathcal{K}_n))$. Let $[G_n,i_n,Q_n]$ be a minimum point, and denote $q(n) = p([G_n,i_n,Q_n])$.

By Theorem 3.19, the set of $L$-connected, degree $d$ graphs is compact, and so, by again invoking Claim 3.21, we have that the sequence $\{[G_n,i_n,Q_n]\}_{n=1}^\infty$ has a converging subsequence that must converge to some infinite $L$-connected, degree $d$ equilibrium graph strategy $[G,i,Q]$. By Theorem 1 we have that $p([G,i,Q]) = 1$, and so, by lower semi-continuity of $p$, it follows that

$$\lim_{n} q(n) = \lim_{n} p([G_n,i_n,Q_n]) \geq p([G,i,Q]) = 1.$$

□



3.10 Proof of Lemma 3.43

Before proving Lemma 3.43 we prove the following claim.

Claim 3.45. Let $[G,i_0,Q]$ be an equilibrium graph strategy. Let $\{i_n\}_{n=1}^\infty$ be a sequence of vertices such that the graph distance $\Delta(i_0,i_n)$ diverges with $n$. Fix $t$, and for each $n$ let $X^{i_n}$ be $\mathcal{F}^{i_n}_t$-measurable. Then

$$\lim_{n \to \infty} \mathrm{dep}_S(X^{i_n}, S_\infty) = 0.$$



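Proof. Let $\mathcal{B}^i_r = \sigma(\{W_j, Q^j : j \in B_r(G,i)\})$. We first show, by induction on $r$, that $\mathcal{F}^i_r \subseteq \mathcal{B}^i_r$: any $\mathcal{F}^i_r$-measurable random variable is also $\mathcal{B}^i_r$-measurable. It will follow that $X^{i_n}$ is $\mathcal{B}^{i_n}_t$-measurable. Clearly $\mathcal{F}^i_0 \subseteq \mathcal{B}^i_0$. Assume now that $\mathcal{F}^j_{r'} \subseteq \mathcal{B}^j_{r'}$ for all $j$ and $r' < r$. By definition, $\mathcal{F}^i_r = \sigma(\mathcal{F}^i_{r-1}, A^{\partial i}_{r-1})$. For $j \in \partial i$ we have that $A^j_{r-1}$ is $\mathcal{B}^j_{r-1}$-measurable. Finally, $\mathcal{B}^j_{r-1} \subseteq \mathcal{B}^i_r$, and so $\mathcal{F}^i_r \subseteq \mathcal{B}^i_r$.

Note that for $i,j$ and $r_1,r_2$ such that $B_{r_1}(G,i)$ and $B_{r_2}(G,j)$ are disjoint it holds that $\mathcal{B}^i_{r_1}$ and $\mathcal{B}^j_{r_2}$ are independent conditioned on $S$, since the choices of pure strategies are independent and private beliefs are independent conditioned on $S$.

Let $R^{i_0}_r$ be a MAP estimator of $S_\infty$ given $\mathcal{B}^{i_0}_r$. Since $\Delta(i_0,i_n) \to_n \infty$, it follows that for any $r$ and $n$ large enough $B_t(G,i_n)$ and $B_r(G,i_0)$ are disjoint, and so $X^{i_n}$ and $R^{i_0}_r$ are independent, conditioned on $S$. For such $n$, by Claim 3.32 we have that $(X^{i_n},S_\infty)$ are $2\delta$-independent, for $\delta = P[R^{i_0}_r \neq S_\infty]$.

Finally, since $S_\infty$ is $\mathcal{B}^{i_0}_\infty$-measurable, it follows that

$$\lim_{r \to \infty} P[R^{i_0}_r \neq S_\infty] = 0,$$

and so

$$\lim_{n \to \infty} \mathrm{dep}_S(X^{i_n}, S_\infty) = 0.$$

□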



Proof of Lemma 3.43. Denote by $\mathcal{C}$ the closure of $\mathcal{GS}(\mathcal{R}(\{G\}))$. Note that by Claim 3.15 any graph strategy in $\mathcal{C}$ is in equilibrium.

We shall prove by induction a stronger claim, namely that under the claim hypothesis, for all $[H,j,Q] \in \mathcal{C}$, $k \in \mathbb{N}$, $\epsilon > 0$ and $\delta > 0$ there exists an agent $i$ in $H$, a time $t$ and $\mathcal{F}^i_t$-measurable binary random variables $(X_1,\ldots,X_k)$ that are $(p^*(\mu_0,\mu_1) - \epsilon, \delta)$-good estimators of $S$.

We prove the claim by induction on $k$. The claim holds trivially for $k = 0$. Assume that the claim holds up to $k$.

Let $[H,j,Q] \in \mathcal{C}$. Let $\{j_n\}_{n=1}^\infty$ be a sequence of vertices in $H$ such that $\lim_n \Delta(j,j_n) = \infty$. Since $\mathcal{C}$ is compact, there exists a converging sequence $[H,j_n,Q] \to_n [F,i',R] \in \mathcal{C}$. By the inductive assumption, there exists an agent $i$ in $F$, a time $t$ and random variables $(X^i_1,\ldots,X^i_k)$ which are $\mathcal{F}^i_t$-measurable and are $(p^*(\mu_0,\mu_1) - \epsilon', \delta')$-good estimators of $S$, for some $0 < \epsilon' < \epsilon$ and $0 < \delta' < \delta$. Denote $X^i = (X^i_1,\ldots,X^i_k)$.

Let $r = \Delta(i',i)$. Since for $n$ large enough $B_r(H,j_n) \cong B_r(F,i')$, if we let $i_n \in B_r(H,j_n)$ be the vertex that corresponds to $i \in B_r(F,i')$ then $[H,i_n,Q] \to_n [F,i,R]$, and $\lim_n \Delta(j,i_n) = \infty$. Since $X^i_\ell$ is $\mathcal{F}^i_t$-measurable, there exists a function $x^i_\ell$ such that $X^i_\ell = x^i_\ell(I_i, A^{\partial i}_{[0,t)})$. Denote $X^{i_n}_\ell = x^i_\ell(I_{i_n}, A^{\partial i_n}_{[0,t)})$ for $\ell = 1,\ldots,k$, and $X^{i_n} = (X^{i_n}_1,\ldots,X^{i_n}_k)$.

Now, since the strategies of agents in the neighborhood of $i_n$ in $H$ converge in the weak topology to those of $i$ in $F$, the random variables $\{(S, X^{i_n})\}_{n=1}^\infty$ converge in the weak topology to $(S, X^i)$. Moreover, the measures of these random variables are over the finite space $\{0,1\}^{k+1}$, and so we also have convergence in total variation. In particular, $(X^{i_n}_1,\ldots,X^{i_n}_k)$ approach $\delta'$-independence:

$$\lim_{n \to \infty} \mathrm{dep}_S(X^{i_n}_1,\ldots,X^{i_n}_k) = \mathrm{dep}_S(X^i_1,\ldots,X^i_k) \leq \delta'. \qquad (3)$$

Likewise,

$$\lim_{n \to \infty} P[X^{i_n}_\ell = S] = P[X^i_\ell = S] \geq p^*(\mu_0,\mu_1) - \epsilon', \qquad (4)$$

for $\ell = 1,\ldots,k$. Additionally, since $\Delta(j,i_n) \to_n \infty$, it follows by Claim 3.45 that

$$\lim_{n \to \infty} \mathrm{dep}_S(X^{i_n}, S_\infty) = 0, \qquad (5)$$

that is, $X^{i_n}$ and $S_\infty$ are practically independent, for large $n$.

Now, recall that $S_\infty$ is $\mathcal{F}^{i_n}_\infty$-measurable. Therefore, if we let $R^{i_n}_{t'}$ be a MAP estimator of $S_\infty$ given $\mathcal{F}^{i_n}_{t'}$, then for any $n$ it holds that

$$\lim_{t' \to \infty} P[R^{i_n}_{t'} = S_\infty] = 1. \qquad (6)$$

By Claim 3.32, a consequence of Eqs. (5) and (6) is that

$$\lim_{n \to \infty} \lim_{t' \to \infty} \mathrm{dep}_S(X^{i_n}, R^{i_n}_{t'}) = 0.$$

That is, $X^{i_n}$ and $R^{i_n}_{t'}$ are practically independent, for large enough $n$ and $t'$. It follows by Claim 3.33 that

$$\lim_{n \to \infty} \lim_{t' \to \infty} \mathrm{dep}_S(X^{i_n}_1,\ldots,X^{i_n}_k, R^{i_n}_{t'}) \leq \delta'. \qquad (7)$$

It follows from Eq. (6) that

$$\lim_{t' \to \infty} P[R^{i_n}_{t'} = S] = P[S_\infty = S] \geq p^*(\mu_0,\mu_1). \qquad (8)$$

Gathering the above results, we have that for $n$ and $t'$ large enough,

1. $P[X^{i_n}_\ell = S] > p^*(\mu_0,\mu_1) - \epsilon$ by Eq. (4). Likewise $P[R^{i_n}_{t'} = S] > p^*(\mu_0,\mu_1) - \epsilon$ by Eq. (8).

2. $(X^{i_n}_1,\ldots,X^{i_n}_k,R^{i_n}_{t'})$ are $\delta$-independent, by Eq. (7).

We therefore have that $(X^{i_n}_1,\ldots,X^{i_n}_k,R^{i_n}_{t'})$ are $\mathcal{F}^{i_n}_{t'}$-measurable $(p^*(\mu_0,\mu_1) - \epsilon, \delta)$-good estimators of $S$. □



4 Examples 

In this section we give two examples showing that the assumptions of bounded out-degree and 
L-connectedness are crucial. To this end we need to construct equilibrium strategies where we can 
specify some of the responses. Our approach will be to describe the initial moves of the agents and 
then extend this to a (usual sense) equilibrium strategy profile. 

Define the set of times and histories agents have to respond to as $\mathscr{H} = \{(i,t,a) : i \in V,\ t \in \mathbb{N},\ a \in [0,1] \times \{0,1\}^{|\partial i| \cdot t}\}$. The set $[0,1] \times \{0,1\}^{|\partial i| \cdot t}$ is interpreted as the pair of the private belief of $i$ and the history of actions observed by agent $i$ up to time $t$. If $a \in [0,1] \times \{0,1\}^{|\partial i| \cdot t}$ then for $0 \leq t' \leq t$ we let $a_{t'} \in [0,1] \times \{0,1\}^{|\partial i| \cdot t'}$ denote the history restricted to times up to $t'$. We say a subset $\mathcal{H} \subseteq \mathscr{H}$ is history-closed if for every $(i,t,a) \in \mathcal{H}$ we have for all $0 \leq t' \leq t$ that $(i,t',a_{t'}) \in \mathcal{H}$.

For a strategy profile $Q$ denote the optimal utility for $i$ under any response as $u^*_i(Q) = \sup_R u_i(R)$, where the supremum is over strategy profiles $R$ such that $R^j = Q^j$ for all $j \neq i$ in $V$.

Definition 4.1. On a history-closed subset $\mathcal{H} \subseteq \mathscr{H}$ a forced response $q_{\mathcal{H}}$ is a map $q_{\mathcal{H}} : \mathcal{H} \to \{0,1\}$ denoting a set of actions we force the agents to make. A strategy profile $Q$ is $q_{\mathcal{H}}$-forced if for every $(i,t,a) \in \mathcal{H}$, if agent $i$ at time $t$ has seen history $a$ from its neighbours then it selects action $q_{\mathcal{H}}(i,t,a)$. A strategy profile $Q$ is a $q_{\mathcal{H}}$-equilibrium if it is $q_{\mathcal{H}}$-forced and for every agent $i \in V$ it holds that $u_i(Q) \geq u_i(R)$ for any $q_{\mathcal{H}}$-forced strategy profile $R$ such that $R^j = Q^j$ for all $j \neq i$ in $V$.

The following lemma can be proved by a minor modification of Lemma 2.13 and so we omit the proof.

Lemma 4.2. Let $\mathcal{H} \subseteq \mathscr{H}$ be history-closed and let $q_{\mathcal{H}}$ be a forced response. There exists a $q_{\mathcal{H}}$-equilibrium.

Having constructed $q_{\mathcal{H}}$-equilibria we will then want to show that they are equilibria. In order to do that we appeal to the following lemma.

Lemma 4.3. Let $Q$ be a $q_{\mathcal{H}}$-equilibrium. Suppose that for every agent $i$, any strategy profile $R$ that attains $u^*_i(Q)$ has that for all $t$,

$$P\Big[Q^i_t(I_i, A^{\partial i}_{[0,t)}) \neq R^i_t(I_i, A^{\partial i}_{[0,t)}),\ \big(i,t,(I_i, A^{\partial i}_{[0,t)})\big) \in \mathcal{H}\Big] = 0. \qquad (9)$$

Then $Q$ is an equilibrium.

Proof. If $Q$ is not an equilibrium then by compactness there exists a strategy profile $R$ that attains $u^*_i(Q)$ and differs from $Q$ only for agent $i$. By equation (9) this implies that agent $i$ following $R$ must take the same actions almost surely as if they were following $Q$ until the end of the forced moves. Hence $R$ is $q_{\mathcal{H}}$-forced and so $R$ is a $q_{\mathcal{H}}$-equilibrium. It follows that $i$ cannot increase the utility of $Q$, which is therefore an equilibrium. □

In order to show that every agent follows the forced moves almost surely we now give a lemma which gives a sufficient condition for an agent to act myopically, according to its posterior distribution. For an equilibrium strategy profile $Q$ let $Q_{i,t,a}$ be the strategy profile where the agents follow $Q$ except that if agent $i$ has $a = (I_i, A^{\partial i}_{[0,t)})$ then from time $t$ onwards agent $i$ acts myopically, taking action $B^i_{t'}(G,Q_{i,t,a})$ for time $t' \geq t$. We denote

$$Y_\ell = Y_\ell(i,t,a) := E\Big[\big|P[S = 1 \mid \mathcal{F}^i_{t+\ell}(G,Q_{i,t,a})] - \tfrac{1}{2}\big| \ \Big|\ \mathcal{F}^i_t,\ a = (I_i, A^{\partial i}_{[0,t)})\Big].$$



We will show that the following are sufficient conditions for agent i to act myopically. For I S 
{1, 2, 3} we set Bg = < 2Yq > — ^ and we set 



l-A 



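$$B_4 = \Big\{2Y_0 > \lambda^2\big(\tfrac{1}{2} - Y_2\big) + \frac{\lambda^3\big(\tfrac{1}{2} - Y_3\big)}{1 - \lambda}\Big\}.$$

Since $Q$ and $Q_{i,t,a}$ are the same up to time $t-1$ we have that $\mathcal{F}^i_t(G,Q)$ is equal to $\mathcal{F}^i_t(G,Q_{i,t,a})$. As $Y_\ell$ is the expectation of a submartingale it is increasing in $\ell$. Hence, after rearranging we see that $B_1 \subseteq B_2 \subseteq B_3 \subseteq B_4$.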

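Lemma 4.4. Suppose that for strategy profile $Q$ agent $i$ has an optimal response, such that for any $R$ such that $R^j = Q^j$ for all $j \neq i$ in $V$ then $u_i(Q) \geq u_i(R)$. Then for any $t$,

$$P\big[A^i_t(G,Q) \neq B^i_t,\ B_1 \cup B_2 \cup B_3 \cup B_4\big] = 0,$$

that is, agent $i$ acts myopically at time $t$ under $Q$ almost surely, on the event $B_1 \cup B_2 \cup B_3 \cup B_4$.

Proof. If agent $i$ acts under $Q_{i,t,a}$ then its utility from time $t$ onwards given $a$ is

$$u_{i,t,a}(Q_{i,t,a}) := (1-\lambda)\sum_{t'=t}^{\infty} \lambda^{t'} E\Big[\mathbf{1}\big(A^i_{t'}(G,Q_{i,t,a}) = S\big) \ \Big|\ \mathcal{F}^i_t,\ a = (I_i, A^{\partial i}_{[0,t)})\Big]$$

$$\geq (1-\lambda)\lambda^t\Big(\tfrac{1}{2} + Y_0 + \lambda\big(\tfrac{1}{2} + Y_1\big) + \lambda^2\big(\tfrac{1}{2} + Y_2\big) + \frac{\lambda^3}{1-\lambda}\big(\tfrac{1}{2} + Y_3\big)\Big)$$

under $Q_{i,t,a}$. Now assume that the action of agent $i$ at time $t$ under $Q$ is not the myopic choice. Then its utility is at most

$$u_{i,t,a}(Q) \leq (1-\lambda)\lambda^t\Big(\tfrac{1}{2} - Y_0 + \lambda E\Big[\mathbf{1}\big(A^i_{t+1}(G,Q) = S\big) \ \Big|\ \mathcal{F}^i_t,\ a = (I_i, A^{\partial i}_{[0,t)})\Big] + \frac{\lambda^2}{1-\lambda}\Big).$$

We note that at time $t+1$ the information available about $S$ is the same under both strategies since the only difference is the choice of action by agent $i$ at time $t$, hence as $i$ takes the optimal action under $Q_{i,t,a}$,

$$\tfrac{1}{2} + Y_1 = E\Big[\mathbf{1}\big(A^i_{t+1}(G,Q_{i,t,a}) = S\big) \ \Big|\ \mathcal{F}^i_t,\ a = (I_i, A^{\partial i}_{[0,t)})\Big] \geq E\Big[\mathbf{1}\big(A^i_{t+1}(G,Q) = S\big) \ \Big|\ \mathcal{F}^i_t,\ a = (I_i, A^{\partial i}_{[0,t)})\Big].$$

Since $Q$ is optimal for $i$ we have that

$$0 \geq u_{i,t,a}(Q_{i,t,a}) - u_{i,t,a}(Q) \geq (1-\lambda)\lambda^t\Big(2Y_0 - \lambda^2\big(\tfrac{1}{2} - Y_2\big) - \frac{\lambda^3}{1-\lambda}\big(\tfrac{1}{2} - Y_3\big)\Big). \qquad (10)$$

Condition (10) does not hold under $B_4$, so $P[A^i_t(G,Q) \neq B^i_t,\ B_1 \cup B_2 \cup B_3 \cup B_4] = 0$. □

4.1 The Royal Family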



In the main theorem we require that the graph $G$ not only be strongly connected, but also $L$-connected and have bounded out-degrees, which are local conditions. In the following example the graph is strongly connected and has bounded out-degrees, but is not $L$-connected. We show that for bounded private beliefs asymptotic learning does not occur in all equilibria.²

Example 4.5. Consider the following graph. The vertex set is comprised of two groups of agents: a "royal family" clique of $R$ agents who all observe each other, and $n \in \mathbb{N} \cup \{\infty\}$ agents, the "public", who are connected in an undirected chain and in addition can all observe all the agents in the royal family. Finally, a single member of the royal family observes one of the public, so that the graph is strongly connected.

We choose $\mu_0$ and $\mu_1$ so that $\mathbb{P}[Z^i_0 \in (1, 2) \cup (-2, -1)] = 1$ and set the forced moves so that all agents act myopically at time 1. By Lemma 4.2 we can extend this to a forced equilibrium $Q$. By Lemma 4.3 it is sufficient to show that no agent can achieve their optimum without choosing the myopic action in the first round. By our choice of $\mu_0$ and $\mu_1$ we have that

$$\left| \mathbb{P}[S = 1 \mid \mathcal{F}^i_0] - \tfrac12 \right| = \frac{e^{|Z^i_0|}}{1 + e^{|Z^i_0|}} - \frac12 \geq \frac{e}{1+e} - \frac12 > \frac15.$$

Hence in the notation of Lemma 4.4 we have that $Y_0 > \tfrac15$ when $t = 0$ for all $i$ and $a$ almost surely. Moreover, after the first round all agents see the royal family and can combine their information. Since the signals are bounded it follows that for some $c = c(\mu_0, \mu_1) > 0$, independent of $R$ and $n$,

$$\mathbb{E}\left[ \tfrac12 - \left| \mathbb{P}[S = 1 \mid \mathcal{F}^i_1] - \tfrac12 \right| \right] \leq e^{-cR}.$$

Hence if $R$ is a large constant $B_2$ holds, so by Lemma 4.4 if an agent is to attain its maximal utility given the actions of the other agents it must act myopically almost surely at time 0. Thus $Q$ is an equilibrium.

² We draw on Bala and Goyal's [4] royal family graph.

Let $J$ denote the event that all agents in the royal family have a signal favouring state 1. On this event, under $Q$ all agents in the royal family choose action 1 at time 0, and this is observed by all the agents, so $J \in \mathcal{F}^i_1$ for all $i$. Since agents observe at most one other agent this signal overwhelms their other information and so

$$\mathbb{P}[S = 1 \mid \mathcal{F}^i_1, J] \geq 1 - e^{-cR}$$

for all $i \in V$. Thus if $R$ is a large constant $B_1$ holds for all the agents at time 1, so by Lemma 4.4 they all act myopically and choose action 1 at time 1. Since $J \in \mathcal{F}^i_1$ they also all knew this was what would happen, so they gain no extra information. Iterating this argument we see that all agents choose 1 in all subsequent rounds. However, $\mathbb{P}[J, S = 0] \geq e^{-c'R}$ where $c'$ is independent of $R$ and $n$. Hence as we let $n$ tend to infinity the probability of learning does not tend to 1, and when $n$ equals infinity the probability of learning does not equal 1.
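The two quantitative facts used in this example can be checked with a short sketch. It assumes a uniform prior on $S$ and, purely for illustration, a hypothetical symmetric binary signal whose log-likelihood ratio is exactly $\pm z$ with $z = 1.5$; the paper only requires the ratio to lie in $(1,2) \cup (-2,-1)$. The function names are not from the paper.

```python
import math

def posterior_from_llr(z):
    """P[S = 1 | signal] for log-likelihood ratio z under a uniform prior."""
    return 1.0 / (1.0 + math.exp(-z))

def prob_royals_mislead(R, z=1.5):
    """P[J, S = 0] when each of R royals gets a symmetric binary signal.

    Assumption: the signal favours the true state with probability
    q = e^z / (1 + e^z), so it favours state 1 with probability 1 - q
    when S = 0. The prior P[S = 0] is taken to be 1/2.
    """
    q = posterior_from_llr(z)
    return 0.5 * (1.0 - q) ** R

# |posterior - 1/2| is smallest at |z| = 1, where it equals e/(1+e) - 1/2.
gap = posterior_from_llr(1.0) - 0.5
print(gap > 0.2)

# The failure probability decays exponentially in R but does not involve n.
for R in (5, 10, 20):
    print(R, prob_royals_mislead(R))
```

Since the size $n$ of the public never enters `prob_royals_mislead`, the failure probability stays bounded away from zero as $n$ grows, which is the point of the example.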



4.2 The Mad King 

More surprising is that there exist undirected (i.e., 1-connected) graphs with equilibria where asymptotic learning fails; these graphs have unbounded out-degrees. Note that in the myopic case learning is achieved on these graphs [19], and so this is an example in which strategic behavior impedes learning.

In this example we consider a finite graph which includes 5 classes of agents. There is a king, labeled $u$, and a regent labeled $v$. The court consists of $R_C$ agents and the bureaucracy of $R_B$ agents. The remaining $n$ are the people. Note again that the graph is undirected.

• The king is connected to the regent, the court and the people. 

• The regent is connected to the king and to the bureaucracy. 

• The members of the court are each connected only to the king. 

• The members of the people are each connected only to the king. 

• The members of the bureaucracy are each connected only to the regent. 

See Figure 1.
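The connection rules listed above can be written down directly as an adjacency structure. The following is a minimal sketch; the function names, the node labels and the class sizes are hypothetical and chosen only for illustration.

```python
from collections import deque

def mad_king_graph(n_court, n_bureau, n_people):
    """Build undirected adjacency sets for the five classes of agents."""
    adj = {"king": set(), "regent": set()}

    def link(a, b):
        # Undirected edge: each endpoint observes the other.
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    link("king", "regent")
    for k in range(n_court):
        link("king", f"court{k}")
    for k in range(n_people):
        link("king", f"people{k}")
    for k in range(n_bureau):
        link("regent", f"bureau{k}")
    return adj

def is_connected(adj):
    """Breadth-first search from an arbitrary vertex."""
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(adj)

g = mad_king_graph(n_court=3, n_bureau=50, n_people=1000)
print(len(g["king"]), len(g["regent"]), is_connected(g))  # 1004 51 True
```

The king's degree grows with the number of people, which is the sense in which these graphs have unbounded out-degrees.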

As in the previous example we will describe some initial forced equilibrium and then appeal to existence results to extend it to an equilibrium. We suppose that $\mu_0$ and $\mu_1$ are such that $\mathbb{P}[Z^i_0 \in (1, 1+\epsilon) \cup (-\sqrt{7}, -\sqrt{7}+\epsilon)] = 1$ where $\epsilon$ is some very small positive constant, and will choose $R_C$, $\lambda$ and $R_B$ so that $e^{R_C}$ is much smaller than $\frac{1}{1-\lambda}$, which in turn will be much smaller than $R_B$:

$$e^{R_C} \ll \frac{1}{1-\lambda} \ll R_B.$$

Figure 1: The Mad King.



The equilibrium we describe will involve the people being forced to choose action 0 in rounds 0 and 1, as otherwise the king "punishes" them by withholding his information. As an incentive to comply he offers them the opinion of his court and, later, of his bureaucracy. While the opinion of the bureaucracy is correct with high probability, it is still bounded, and so, even as the size of the public tends to infinity, the probability of learning stays bounded away from one.

We now describe a series of forced moves for the agents, fixing $\delta > 0$ to be some small constant.

• The regent acts myopically at time 0. If for some state $s$ we have $\mathbb{P}[S = s \mid \mathcal{F}^v_1] \geq 1 - e^{-\delta R_B}$ then the regent chooses state $s$ in round 1 and all future rounds; otherwise his moves are not forced.

• The king acts myopically in rounds 0 and 1 unless one or more of the people chooses action 1 in round 0 or 1, in which case, enraged, he chooses action 1 in all future rounds. Otherwise, if $s$ is the action of the regent at time 1 then from time 2 the king takes action $s$ until the regent deviates and chooses another action.

• The members of the bureaucracy act myopically in rounds 0 and 1. If $s$ is the action of the regent at time 1 then from time 2 the members of the bureaucracy take action $s$ until the regent deviates and chooses another action.

• The members of the court act myopically in rounds 0 and 1. At time 2 they copy the action of the king from time 1. If $s$ is the action of the king at time 2 then from time 3 the members of the court take action $s$ until the king deviates and chooses another action.

• The people choose action 0 in rounds 0 and 1. At time 2 they copy the action of the king from time 1. If $s$ is the action of the king at time 2 then from time 3 the people take action $s$ until the king deviates and chooses another action.



By Lemma 4.2 this can be extended to a forced equilibrium strategy Q. We will show that 
this is also an equilibrium strategy in the unrestricted game by establishing equation Q. In what 
follows when we say acts optimally or in an optimal strategy we mean for an agent with respect to 
the actions of the other agents under Q. 

First consider the regent. By our choice of $\mu_0$ and $\mu_1$ we have that $Y_0 > \tfrac15$. Let $J = J_0 \cup J_1$ where $J_s$ denotes the event that $\mathbb{P}[S = s \mid \mathcal{F}^v_1] \geq 1 - e^{-\delta R_B}$. Since the regent views all the myopic actions of the bureaucracy it knows the correct value of $S$ except with probability exponentially small in $R_B$, so for $s \in \{0, 1\}$, if $\delta > 0$ is small enough,

$$\mathbb{P}[J_s \mid S = s] \geq 1 - e^{-\delta R_B}$$

and hence for large enough $R_B$ we have that $Y_1 \geq \tfrac12 - 2e^{-\delta R_B}$, which implies that $B_2$ holds at time 1. By Lemma 4.4 in any optimal strategy the regent acts myopically in round 0, and so follows the forced move. On the event $J_s$ the regent follows $s$ in all future steps. At time 1 condition $B_1$ holds, so again the regent follows the forced move in any optimal strategy. We next claim that for large enough $R_B$

$$\mathbb{P}\left[\, \mathbb{P}[S = s \mid \mathcal{F}^v_2] \geq 1 - e^{-\delta R_B/2} \;\middle|\; J_s \right] = 1. \qquad (11)$$

Assuming (11) holds, condition $B_1$ again holds, so the regent must choose $s$ at time 2 in any optimal strategy. By construction of the forced moves, from time 2 onwards the king and bureaucracy simply imitate the regent and so it receives no further information from time 2 onwards. Thus, again using Lemma 4.4, we see that under any optimal strategy the regent must follow its forced moves.

To establish that the regent follows the forced moves in any optimal strategy it remains to show that Condition (11) holds. The information available to the regent at time 2 includes the actions of the king and the bureaucracy at times 0 and 1. Consider the actions of the bureaucracy at times 0 and 1. At time 0 each member follows its initial signal. At time 1 they also learn the initial action of the regent, who acts myopically. By our assumption on $\mu_0$ and $\mu_1$ that $\mathbb{P}[Z^i_0 \in (1, 1+\epsilon) \cup (-\sqrt{7}, -\sqrt{7}+\epsilon)] = 1$, an initial signal towards 0 is much stronger than an initial signal towards 1, since whenever $Z^i_0$ is negative it is at most $-\sqrt{7}+\epsilon$. For $i$, a member of the bureaucracy, we have that $Z^i_1 \geq 2$ if both $i$ and the regent choose action 1 at time 1. However, if either $i$ or the regent choose action 0 at time 1 then $Z^i_1 \leq -\sqrt{7} + \epsilon + 1 + \epsilon < -1$. Since the actions of $i$ and the regent at time 0 are known to the regent at time 1, he gains no extra information at time 2 from his observation of $i$ at time 1, since he can correctly predict its action.
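The arithmetic behind these bounds can be sketched as interval addition. The sketch assumes that a bureaucrat's log-likelihood ratio after one round is the sum of two conditionally independent contributions (its own signal and what the regent's initial action reveals), each lying in $(1, 1+\epsilon)$ for action 1 and in $(-\sqrt{7}, -\sqrt{7}+\epsilon)$ for action 0. The value `EPS = 0.01` and the function names are hypothetical.

```python
import math

EPS = 0.01  # hypothetical small value standing in for epsilon

def llr_range(action):
    """Interval containing the log-likelihood ratio behind a myopic action."""
    if action == 1:
        return (1.0, 1.0 + EPS)
    return (-math.sqrt(7), -math.sqrt(7) + EPS)

def combined_range(a_i, a_regent):
    """Interval for the sum of the two contributions (interval addition)."""
    lo_i, hi_i = llr_range(a_i)
    lo_r, hi_r = llr_range(a_regent)
    return (lo_i + lo_r, hi_i + hi_r)

for pair in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    lo, hi = combined_range(*pair)
    print(pair, round(lo, 4), round(hi, 4))
```

Only the pair `(1, 1)` gives a positive interval, starting at 2; every pair containing a 0 lies entirely below $-1$, so the bureaucrat's next action is determined by the two initial actions alone.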

The information the regent has available at time 2 is thus his information from time 1 together with the information from observing the king. The information available to the king is a function of his initial signal and that of the regent and the court. Since this is only $R_C + 1$ members and we choose $R_B$ to be much larger than $R_C$, it is insignificant compared to the information the regent observed from the bureaucracy at time 0, and hence (11) holds. Thus, there is no optimal strategy for the regent that deviates from the forced moves.

As we noted above, the members of the bureaucracy have $|Z^i_0|, |Z^i_1| > 1$ almost surely. For $t \geq 1$ let $M_{s,t}$ denote the event that the regent chose action $s$ for times 1 up to $t$. As argued above, $J_s \subseteq M_{s,t}$ for all $t$ under $Q$. This analysis holds even if a single member of the bureaucracy adopts a different strategy, as we have taken $R_B$ to be large so this change is insignificant. Given that $M_{s,t}$ holds, the only additional information available to agent $i$, a member of the bureaucracy, is their original signal and the action at time 1 of the regent. Thus

$$\mathbb{P}[S = s \mid \mathcal{F}^i_t, M_{s,t}] \geq 1 - e^{-\delta R_B/2}.$$

It follows then by Lemma 4.4 that acting myopically at times 0 and 1 and then imitating the regent until he changes his action is the sole optimal strategy for a member of the bureaucracy.

Next consider the forced responses of the king. Since under $Q$ the people always choose action 0 at times 0 and 1, the rule forcing the king to choose action 1 after seeing a 1 from the people is never invoked. We claim that, provided $R_B$ is taken to be sufficiently large, the king acts myopically at times 0 and 1. At time 0 the posterior probability of $S = 1$ is bounded away from $1/2$, so $Y_0$ is bounded away from 0, while $\tfrac12 - Y_2 \leq 2e^{-\delta R_B/2}$, so by Lemma 4.4 the king must act myopically. Similarly at time 1, since by our choice $\mu_0$ and $\mu_1$ have their log-likelihood ratio concentrated around either 1 or $-\sqrt{7}$, a posterior calculation gives that

$$\left| Z^u_1 - \#\{i \in \partial u : A^i_0(Q) = 1\} + \sqrt{7}\, \#\{i \in \partial u : A^i_0(Q) = 0\} \right| \leq \epsilon(2 + R_C)$$

and thus for some $\epsilon(R_C) > 0$ sufficiently small we can find an $\epsilon'(\epsilon, R_C) > 0$ such that $Y_0 = \left| \mathbb{P}[S = 1 \mid \mathcal{F}^u_1] - \tfrac12 \right| > \epsilon'$. Since we again have that $\tfrac12 - Y_1 \leq 2e^{-\delta R_B/2}$, taking $R_B = R_B(\epsilon, R_C)$ to be sufficiently large, $B_2$ holds and so the king must act myopically. It remains to see that in any optimal strategy the king should imitate the regent from time 2 onwards unless the regent subsequently changes his action. This follows from a similar analysis to the case of the members of the bureaucracy, so we omit it.

We next move to an agent $i$, a member of the court. At time 0 the agent has $Y_0 \geq \frac{e}{1+e} - \tfrac12 > \tfrac15$. Agent $i$ at time 1 views the action of the king, who has in turn viewed the actions of the whole court at time 0, so $\tfrac12 - Y_2 \leq e^{-cR_C}$. At time 2 the agent sees the action of the king, who has imitated the action of the regent at time 1, so $\tfrac12 - Y_3 \leq e^{-\delta R_B/2}$. Hence provided that $R_C$ is sufficiently large and $R_B(R_C, \lambda)$ is sufficiently large then $B_4$ holds and $i$ must act myopically at time 0. The information of a member of the court at time 1 is a combination of their initial signal and the action of the king at time 1. Similarly to a member of the bureaucracy, by the choice of $\mu_0$ and $\mu_1$ we have that $|Z^i_1| > 1$ and so $Y_0 > \tfrac15$. Also $\tfrac12 - Y_2 \leq e^{-\delta R_B/2}$, since this includes the information from the action of the regent at time 1. Thus $B_3$ holds and $i$ must act myopically at time 1. At time 2 agent $i$ knows the action of the king from round 2, so $Y_0 \geq \tfrac12 - e^{-cR_C}$ and $\tfrac12 - Y_1 \leq e^{-\delta R_B/2}$, so $B_2$ holds and $i$ must act myopically at time 2. Finally, from time 3 onwards agent $i$ knows the action of the regent at time 1. As with the king and bureaucracy this will not be changed unless $i$ receives new information, that is, unless the king changes his action sometime after time 2. Thus any optimal strategy of $i$ follows the forced moves.

This finally leaves the people. Let agent $i$ be one of the people. We first check that it is always better for them to wait and just say 0 in rounds 0 and 1 in order to get more information from the king, their only source. If agent $i$ chooses action 1 at time 0 then the total information it receives is a function of the initial signals of $i$ and the king. Thus, since the signals are uniformly bounded, even if the agent knew the signals exactly we would have that for some $c'(\mu_0, \mu_1)$ the utility from such a strategy is at most $1 - e^{-2c'}$. If the agent acts with 0 at time 0 but 1 at time 1 it can potentially receive information from the initial signals of the king, court and regent as well as its own, but still the optimal utility even using all of this information is at most $1 - e^{-c'(R_C+3)}$. Consider instead the expected utility following the forced moves. On the event $J$ agent $i$ will have utility at least $\lambda^3(1 - e^{-\delta R_B})$, which is greater than $1 - e^{-c'(R_C+3)}$ provided that $\lambda$ is sufficiently close to 1 and $R_B$ is sufficiently large. Thus agent $i$ must choose action 0 at times 0 and 1 in any optimal strategy. The analysis of rounds 2 and onwards follows similarly to the court, and thus any optimal strategy of $i$ follows all the forced moves.
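The comparison between waiting and deviating early can be illustrated numerically. The sketch below simply evaluates the two bounds from the paragraph above, $\lambda^3(1 - e^{-\delta R_B})$ for following the forced moves and $1 - e^{-c'(R_C+3)}$ for deviating; all parameter values and function names are hypothetical.

```python
import math

def forced_utility_lower_bound(lam, delta, r_b):
    """Lower bound lambda^3 * (1 - exp(-delta * R_B)) for following the forced moves."""
    return lam**3 * (1.0 - math.exp(-delta * r_b))

def deviation_utility_upper_bound(c_prime, r_c):
    """Upper bound 1 - exp(-c' * (R_C + 3)) for deviating in round 0 or 1."""
    return 1.0 - math.exp(-c_prime * (r_c + 3))

def waiting_pays(lam, delta, r_b, c_prime, r_c):
    return (forced_utility_lower_bound(lam, delta, r_b)
            > deviation_utility_upper_bound(c_prime, r_c))

# Patient agents and a large bureaucracy: waiting for the king is better.
print(waiting_pays(lam=0.9999, delta=0.1, r_b=200, c_prime=1.0, r_c=5))  # True

# Less patient agents: the discounting cost of waiting outweighs the gain.
print(waiting_pays(lam=0.99, delta=0.1, r_b=200, c_prime=1.0, r_c=5))    # False
```

The second call fails because $1 - \lambda^3$ is then much larger than $e^{-c'(R_C+3)}$, which illustrates why the discount factor must be chosen close to 1 relative to the size of the court.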

This exhaustively shows that there is no alternative optimal strategy for any of the agents which differs from the forced moves. Thus $Q$ is an equilibrium. On the event $J_1$, however, all the agents' actions converge to 1, and $\mathbb{P}[J_1, S = 0] \geq e^{-c''R_B} > 0$ where $c''$ is independent of $R_C$, $R_B$, $\lambda$ and $n$. Hence, as we let $n$ tend to infinity the probability of learning does not tend to 1.